Posted on

Gemini AI is replacing Google Assistant on most mobile devices in 2025

Nearly a decade after Google Assistant arrived, the software is being sunset as Gemini AI takes its place. On Friday, Google announced that before the end of 2025, Gemini will replace Google Assistant on most mobile devices. According to 9to5Google, the only exceptions will be devices running Android 9 or earlier with less than 2GB of RAM.

Google claims that millions of people have already made the switch and use Gemini instead of Google Assistant. Now, Google’s bringing everyone else up to speed, whether they like it or not. The company explained that most users will automatically be upgraded to Gemini in the coming months, and by the end of 2025, “Google Assistant will no longer be accessible on most mobile devices or available for new downloads on mobile app stores.”

It won’t just be your phone, either. Gemini is coming to tablets, cars, watches, and headphones. There is also a Gemini-powered experience in the works for home devices like speakers, smart displays, and TVs. Basically, if it connects to the internet, it’s getting Gemini.

As the widespread switch from Google Assistant to Gemini begins, Google says it is “continuing to focus on improving the quality of the day-to-day Gemini experience, especially for those who have come to rely on Google Assistant.”


Google linked to a blog post comparing the two for those curious about how they differ and the features they share. Gemini can now do almost everything Google Assistant can, from providing weather forecasts and creating calendar events to searching for flights and sending messages. The biggest difference is that Gemini is powered by generative AI.

The timing couldn’t be worse for Apple, which delayed its AI-powered Siri reboot in a failure that the company is referring to as “ugly and embarrassing” internally. While other phone makers are moving full speed ahead with AI, Apple is stuck in neutral.

Source

Posted on

This new AI voice demo will blow your mind

AI has been developing at an accelerated rate over the past year and a half. We’ve seen major leaps in the advanced capabilities of services like OpenAI’s ChatGPT and advancements in Google’s Gemini AI. But now, one AI voice model outdoes them all. Meet Sesame, a new AI voice model designed around delivering “voice presence” that feels like you’re talking to a real person.

To call the results amazing would be a bit of an understatement. The team at Sesame launched an online demo version of its AI model on the company’s website, where you can chat with the AI as one of two personas—Miles or Maya. Both offer distinct voices for the AI, and both can respond in ways you won’t believe without hearing it yourself.

And so far, people are really taking to Sesame and its capabilities. We’ve already seen some amazing interactions between people and the AI—like an interaction between a Reddit user and the Miles voice, where the user tells the AI to act like a boss being confronted about a secret.

In the video, you can clearly hear how Sesame’s AI model responds quickly to what the user is saying, and while the poster did mention editing the piece down some, they mostly edited down some of their own fumbling, as well as the bit where they told the AI how to react.


Others chimed in via the comments to say they had tested it themselves, with one mentioning that they were able to get it to respond in a quicker and even wittier fashion than the interaction showcased in the video. But that doesn’t downplay how crazy this interaction is on its own, or how promising (and terrifying) this technology is.

We’ve always known that AI voice models were going to be the most dangerous. But if Sesame is able to deliver such a realistic and believable voice presence in a demo like this, it’s hard to imagine what would be possible in a fully fleshed-out version of the model.

You can try out Sesame for yourself by heading over to the company’s website and choosing one of the two demo models available. Having tried it out myself, it’s remarkable how easily it can move between normal, intelligent conversation and more specific roleplay situations like those showcased by users on Reddit.

Many of us have been waiting for the moment that AI truly changes everything. While ChatGPT and other services have been promising, Sesame is probably the most promising opportunity I’ve ever personally experienced in the AI revolution, and I’m excited, and cautiously optimistic, about what’s to come next.

Source

Posted on

US Congress demands UK lifts gag on Apple encryption order

US lawmakers have hit out at the Home Office for “attempting to gag” US companies by preventing them from telling Congress whether they have been subject to secret UK orders requiring them to hand over their users’ data.

In an unprecedented intervention, five lawmakers from both sides of the US political divide, led by senator Ron Wyden, have written to the UK’s Investigatory Powers Tribunal (IPT) accusing the British government of undermining Congressional oversight and restricting the free speech of US companies.

Their letter comes as the IPT is preparing to hear closed-door arguments from Apple, which is challenging a notice requiring it to extend UK law enforcement’s existing access to encrypted data stored by customers on the Apple iCloud service anywhere in the world to users of Apple’s Advanced Data Protection (ADP) who choose to hold encryption keys privately on their own devices.

British media organisations, including the BBC, The Times, Financial Times, Reuters, The Guardian, The Telegraph and Computer Weekly, have also filed legal submissions with the IPT today, arguing that there is an important public interest in hearing arguments over the UK’s demands against Apple in a public court.

In the Congressional letter, five US senators and congressmen complained to the Investigatory Powers Tribunal that the secrecy surrounding the orders – known as Technical Capability Notices (TCNs) – is impairing Congress’s power and duty to conduct oversight on matters of national security.

The letter disclosed that Apple and Google have informed Congress that were they to have received Technical Capability Notices, they would be barred by UK law from disclosing them to US lawmakers. The UK embassy has also failed to respond to US requests about potential demands by the UK to other US companies.

“By attempting to gag US companies and prohibit them from answering questions from Congress, the UK is both violating the free speech rights of US companies and impairing Congress’s power and duty to conduct oversight on matters of national security,” the lawmakers wrote.

“The UK’s attempted gag has already restricted US companies from engaging in speech that is constitutionally protected under US law and necessary for ongoing Congressional oversight,” they added.

The letter has been signed by Democrats Ron Wyden, a senator from Oregon who has campaigned for healthcare and the environment; Alex Padilla from California, who is chairman of the Senate Judiciary Subcommittee on Immigration; and Zoe Lofgren, an advocate for digital rights from California.


Republicans Andy Biggs from Arizona, chair of the House Judiciary Subcommittee on Crime and Federal Government Surveillance and a vocal Trump supporter; and Warren Davidson from Ohio, a member of the House Financial Services Committee and a former US soldier, have also signed.

Their unified complaint calls on the IPT to apply principles of open justice to the hearing scheduled for Friday, and for all subsequent proceedings in Apple’s appeal against the Technical Capability Notice. 

The lawmakers note that the existence of the TCN has been widely reported and commented on, which makes any argument for closed hearings to keep the existence of the notice secret “unsustainable”.

The existence of the notice has also been confirmed by Apple’s public decision to withdraw its advanced encryption option, known as Advanced Data Protection, for all UK users. Apple would not have done this “unless it felt compelled to do so by a request to insert a backdoor”.

Holding public hearings would allow lawmakers to hear expert evidence from cyber security specialists, civil society representatives and experts on US-UK data flows, enabling the IPT to reach a well-informed decision over the lawfulness of the notice, they said.

Serious concerns over national security 

The lawmakers argue that the UK’s demands against Apple raise “serious concerns which directly impact national security” and therefore warrant public debate. 

As Computer Weekly previously reported, Tulsi Gabbard, the director of national intelligence, stated in a letter to Congress that the UK’s demands would be “a clear and egregious violation of Americans’ privacy and civil liberties, and open up a serious vulnerability for cyber exploitation by adversarial actors”.

President Donald Trump confirmed in an interview with The Spectator that he had raised the Apple TCN with prime minister Keir Starmer during his visit to Washington, comparing the UK’s actions to the conduct of China.

Chinese exploited US ‘lawful access’

The lawmakers point out that the security of US technology products against surveillance by foreign governments is an important topic for ongoing Congressional oversight following a spate of hacks against the communications of senior US government officials.

China exploited US lawful interception systems in 2023 to reportedly tap the phone calls of Trump and vice-president JD Vance, and to steal millions of phone records after gaining access to major US carriers in the “Salt Typhoon” attack.

In April 2024, hackers stole phone records of “nearly all” AT&T customers, including records of members of the president’s family, the then vice-president, Kamala Harris, and the wife of the now secretary of state, Marco Rubio, in the “Snowflake” incident.

And in 2023, China stole more than 60,000 emails from the department of state and compromised the email accounts of US officials and politicians after hacking into Microsoft-hosted US government email accounts.

“The common link between these incidents is that sensitive government data held by third-party companies was not properly secured and subsequently accessed by hackers … most importantly, the Salt Typhoon incident reportedly involved compromising ‘lawful intercept’ systems of the kind that it appears Apple has been ordered to build,” the letter states.

“Given the significant technical complexity of this issue, as well as the important national security harms that will result from weakening cyber security defences, it is imperative that the UK’s technical demands of Apple – and of any other US companies – be subjected to robust, public analysis and debate by cyber security experts,” the lawmakers wrote.

Vital for US cyber security experts to comment

“Secret court hearings featuring intelligence agencies and a handful of individuals approved by them do not enable robust challenges on highly technical matters. Moreover, given the potential impact on US national security, it is vital that American cyber security experts be permitted to analyse and comment on the security of what is proposed.”


The lawmakers invited the tribunal to permit US companies to discuss the technical demands they have received under the UK’s Investigatory Powers Act with Congress. The IPT should “invite robust public debate by independent cyber security experts before deciding the merits of the reported challenge that Apple has brought”, they said.

Separately, civil society groups Big Brother Watch, Index on Censorship and Open Rights Group have written to the president of the Investigatory Powers Tribunal, the Rt Hon Lord Justice Singh, calling for the case to be made public.

They argue that the case implicates the privacy rights of millions of British citizens who use Apple’s technology, and those of its overseas customers.

There is a “significant public interest in knowing when and on what basis the UK government believes that it can compel a private company to undermine the privacy and security of its customers”, according to the letter.

Big Brother Watch interim director Rebecca Vincent said the tribunal hearing must not take place in secret. “The Home Office’s shocking order to Apple to break encryption represents a huge attack on privacy rights and is unprecedented in any democracy,” she said.

Index on Censorship CEO Jemimah Steinfeld said breaking encryption would do away with our rights to privacy, make us far less safe and secure online, and challenge the very notion of the UK as a democracy. “With such high stakes, we demand to know what could possibly justify this. We need answers, not more secrecy,” she said.

Open Rights Group executive director Jim Killock said: “If the UK wants to claim the right to make all of Apple’s users more likely to be hacked and blackmailed, then they should argue for that in an open court.”

Source

Posted on

The Security Interviews: Yevgeny Dibrov, Armis

Over the past 20 to 30 years, the intelligence community has generated a stream of cyber security leaders – private cyber security companies are littered with former operatives of the American and British intelligence services.

But in Israel’s case, the intelligence-to-cyber pipeline has produced arguably the highest density of cyber security startups and organisations in the world. The likes of Check Point, CyberArk, Imperva, Palo Alto Networks and Radware can all claim links back to the Israel Defence Forces’ (IDF’s) technology units.

Among these units, which likely date back to before Israel’s founding in 1948, are the highly secretive cyber weapons and tech development shop Unit 81, and the more widely known signals intelligence Unit 8200.

Israel’s astonishing concentration of cyber security talent is largely attributable to both Unit 81 and Unit 8200, whose existence has only been fairly recently acknowledged. Mossad may get international attention, but it is Unit 8200 that gets the data to support it and Unit 81 that builds the tech.

Acting as incubators for cyber security and hacking talent, these units benefit from Israel’s compulsory military service laws and intensive screening processes, which divert individuals with potential from frontline armed service, although they also scout after-school computer clubs for likely-looking candidates.

That the IDF is the wellspring of Israel’s cyber talent is these days no secret, but Armis CEO, Yevgeny Dibrov – who is allowed to say little more about the time he served in Unit 81 beyond the fact that he was there – says there’s more to the growth of Israel’s cyber community than just the hothouse conditions at the IDF.

He compares the environment to that of a startup. “When you’re a startup, when you’re building something, you don’t have much budget, but with what you have you still need to do outstanding things that differentiate a lot, that achieve a lot, and that puts you in a great place.

“We don’t have the same budget as the CIA or the NSA, maybe point one of a percent, but we have no choice. There is no other way,” he explains. “We have a lot of enemies and we want to win.”

Make the impossible possible

At first, Dibrov’s pipeline into the IT industry does not seem all that different from most other people’s – stemming from an initial schoolboy interest in computers, maths and physics – but he became hooked when he was tapped for Unit 81 as a fresh-faced teen.

“In the years I spent there I became fascinated by different capabilities, fascinated by this world, fascinated also by working hard for my country,” he says. “Twice during my service I was part of the team that won the Israel Defence Prize, which is for outstanding achievements in the technology space.

“The slogan of our unit was ‘Make the Impossible Possible’,” says Dibrov. “It’s written over the door when you enter. You see it every day, and so you kind of live towards it. It’s not just a cliché.”


But the intelligence forces serve not only as a hub for creative talent, but also as a hub for team-building. Indeed, of Armis’s first cohort of employees, about 50% served alongside Dibrov himself at Unit 81, and the others worked alongside his co-founder – and chief technology officer (CTO) – Nadir Izrael at Unit 8200.

“People get to know each other, and during my time at Unit 81, we were always talking to alumni that actually started companies and did great things,” says Dibrov. “I remember my team leader in the army was [Wiz CEO] Assaf Rappaport, so we were always meeting some of the alumni from our unit and learning what they had done.

“It makes you excited,” he says. “It makes you think, ‘Okay, when I’m out, here is what I want to do’. I already knew that I wanted to start a company.”

At the end of his service, alongside studying at Technion, the Israel Institute of Technology, between 2010 and 2013, Dibrov helped set up Adallom, with which Rappaport was also involved. Adallom was a cloud access security brokerage (CASB) specialising in visibility, governance and protection across business applications such as Box, Google Apps, Microsoft Office 365 and Salesforce.

The firm’s Office 365 work clearly stood out, because in September 2015, Microsoft bought the company for over $300m. Just a couple of months later, Dibrov and Izrael started Armis, with the first employees coming on board in February 2016.

Google Maps, but for vulnerable assets

Asked to “explain like I’m five”, Dibrov describes Armis as a cyber exposure management platform that essentially provides its customers with a Google Map of their IT environment, with every single asset accounted for, from something run-of-the-mill like a laptop or smartphone to operational technology (OT) like industrial controllers, and even medical equipment.

On top of this basic map, Armis provides additional layers covering security risk discovery, monitoring and management, and ultimately, remediation.

“We want to not just allow you to see your risk, but reduce it, whether through patching devices or mitigating threats with different rules in your technology environment,” he says.

Armis was earlier than many to the OT/internet of things (IoT) side of security, mapping it as a factor early on in its history, before the topic really started to hit mainstream security conversations about six or seven years ago. What was the spark that led Dibrov to make this bet?

“We really started from talking to a lot of customers, talking to a lot of CIOs, and we were hearing about the explosion of connected devices,” he explains. “We looked at the variety of different environments and we saw there was a gap.

“On the one hand, you have laptops and servers that are covered by your antivirus or next-gen antivirus, and then you have everything else. And then everything else changes in different industries. If you look at an airport, they have a big gap around a lot of operational technology stuff. They have different distribution centres, logistics centres and more. They have datacentres. They have buildings with building management systems.”

At about the same time, incidents such as NotPetya and WannaCry were exposing the precarious security of such environments – particularly in healthcare settings – and this helped push people towards a more holistic view of cyber security.


“It was a huge push across the board,” says Dibrov. “Everyone suddenly understood that they needed to have visibility into what they have in these environments – because imagine if I’m an attacker, why would I attack a laptop if the laptop has 50 agents on it? I attack the most vulnerable thing, and that’s usually devices that don’t run any agents or antivirus, devices that are mostly not updated or cannot be patched, and a bunch of old XP machines in those areas.

“These devices are often the most important in the organisation. Look at a hospital. How can you compare the importance of a laptop versus an MRI scanner?”

Customers took to this like ducks to water, and today Armis works with over 35% of the Fortune 100.

From day to day, there is no such thing as a typical customer, says Dibrov, but they tend to be larger, distributed organisations with highly complex environments and a lot of devices. Armis claims currently to have approximately 5.3 billion connected devices in harness.

What’s the weirdest ‘thing’ he ever found? “We have things like cars that connect to the company network, to wireless air fryers – we see those a lot. And the amount of types of cameras you would never believe,” says Dibrov. “Security teams have no idea what cameras they have, and they’re 90% Chinese, potentially exploited with backdoors, and often in the most critical environments.”

Like many of its peers, Armis has also been branching out into threat research and frequently publishes its own thought leadership on diverse topics – recent ones include breaking down CISA’s most exploited vulnerabilities and the emergence of DeepSeek.

“We have so much data now, and our customers can benefit from that,” says Dibrov. “We also acquired a company in the space, some super-talented guys who merge a lot of their own data with data we generated to provide early warning, which has been very significant.”

What’s next?

Keeping in touch with Armis’s buyers is a source of pride for Dibrov, who makes a point of frequently checking in with his user advisory board and speaking to six or seven individual customers every day, whether those are long-term existing ones, new ones, or those moving through their procurement or onboarding processes.

“What do they need? What do they think like? What do we need to do different?” says Dibrov. “This is something that is ongoing for us – always listening, always developing, always running fast, and always providing real solutions to real problems.”

Dibrov declares himself particularly paranoid when it comes to the competition, and likes to try to think about 18 months ahead in terms of innovation. “This is something that is always on my mind because that’s the biggest differentiator,” he says. “You need to have first of all the best product, and then to execute from there. That’s what keeps me up at night.”

Armis recently closed a large Series D funding round, raising $200m to take it to a total valuation of over $4bn. And having made two acquisitions in the past 12 months – Silk Security in April 2024 and CTCI in February 2025 – Dibrov is open to more, as well as exploring the possibility of an initial public offering (IPO).

Beyond these goals, Dibrov is, of course, keeping a close eye on the developing threat landscape. His views on where things are going tally with those of many other observers.

“We keep seeing a lot of state actors, from Russia, China, North Korea, Iran. We keep seeing them, and we keep seeing a lot of targeting of EMEA and US critical infrastructure and manufacturing,” he says. “We see them sometimes also leveraging AI [artificial intelligence]. My guess is we’ll see that more and more, and defenders really need to be prepared.”

Source

Posted on

Apple’s big AI-powered Siri upgrade was just delayed to 2026

The long-anticipated personalized Siri, allegedly coming with iOS 18.4, has now been delayed to 2026. Apple spokeswoman Jacqueline Roy told Daring Fireball that the more personalized Siri experience powered by Apple Intelligence will take longer to be released.

Here’s what she said: “Siri helps our users find what they need and get things done quickly, and in just the past six months, we’ve made Siri more conversational, introduced new features like type to Siri and product knowledge, and added an integration with ChatGPT. We’ve also been working on a more personalized Siri, giving it more awareness of your personal context, as well as the ability to take action for you within and across your apps. It’s going to take us longer than we thought to deliver on these features, and we anticipate rolling them out in the coming year.”

Bloomberg’s Mark Gurman had already teased that some of the more personalized Siri features for Apple Intelligence could have been delayed. At the time, the journalist said that the most impressive functions could launch as soon as 2027.

In his Power On newsletter, he revealed that it’s going to take at least two extra years before Apple Intelligence gets somewhat similar to the capabilities that OpenAI’s ChatGPT, Google’s Gemini, and Microsoft’s Copilot can deliver today and, honestly, have been delivering for at least a year now.


According to the journalist, Apple has a long schedule to finally revamp Siri and make it an essential part of the Apple Intelligence platform. This is what you can expect:

  • iOS 18.4: Expected for early April, Apple is expanding the languages available with Apple Intelligence;
  • iOS 18.5: Expected for May, Gurman expected Apple to make Siri tap user data to make it more personalized, but this might have now been pushed back to 2026;
  • iOS 19.4: Expected around April-May of 2026, Siri is getting a new architecture that can operate legacy Siri commands while handling more advanced queries in the same flow;
  • iOS 20: Believe it or not, Gurman’s forecast goes up until 2027, when Apple might be finally able to fix Siri and deliver the LLM Siri, which was technically supposed to be revealed this June.

That said, Apple Intelligence will take much longer to become useful. With that in mind, we now wonder what Apple will do to improve its AI platform.

Source

Posted on

Google is testing Gemini AI in Google Calendar

Google is officially testing Gemini integration with Google Calendar. If you’ve been holding out hope that the Google-powered AI would make the jump to your calendar, you likely won’t have to wait much longer.

The feature is currently only available in Workspace Labs, which is essentially Google’s “beta” program for new Workspace features that will soon be available in Calendar, Gmail, and the rest of its online Workspace apps.

Based on the details outlined in Google’s announcement of the new integration, it looks like only basic commands and prompts are available at the moment. You can ask Gemini to add events, provide details about events, and other things like that. It’s basically everything you’d want to ask an AI assistant to do, and it’s all available in your browser.

Gemini AI assistant in Google Calendar in a web browser. Image source: Google

Considering Google has slowly been ticking off more features for Gemini on its various platforms (including planning to bring Gemini live video to Android this month), it isn’t all that surprising to see Google Calendar finally getting Gemini integration. We already have integration with the AI in Docs, Sheets, and other Google Workspace apps, so it was really only a matter of time before Calendar got the same treatment.


As usual, you can provide details about how Gemini responds to help Google improve the service, and you even have control over deleting your recent Gemini history. It’s unclear if having Google’s Premium AI subscription and access to the “better” versions of Gemini will make the assistant work any better in Calendar—though it hasn’t ever seemed to make a massive difference in the other Workspace apps.

You can join Google Workspace Labs to get a chance at trying these kinds of feature releases early, though keep in mind that the exact way they work and their reliability may change over time as Google improves them.

Personally, I don’t mind seeing Gemini in Calendar. Even on my iPhone, I’ve used Gemini a good bit to help with minor planning for things, so being able to tell it to add new events to my calendar in my browser will be a welcome addition. Of course, I know not everyone sees the invasion of more AI features in our everyday tools as a good thing, so the usefulness of this new feature will vary greatly depending on how you feel, and how much you even use Google Calendar as a whole.

Source

Posted on

DeepSeek-R1: Budgeting challenges for on-premise deployments

Until now, IT leaders have needed to consider the cyber security risks posed by allowing users to access large language models (LLMs) like ChatGPT directly via the cloud. The alternative has been to use open source LLMs that can be hosted on-premise or accessed via a private cloud. 

The artificial intelligence (AI) model needs to run in-memory and, when using graphics processing units (GPUs) for AI acceleration, this means IT leaders need to consider the costs associated with purchasing banks of GPUs to build up enough memory to hold the entire model.

Nvidia’s high-end AI acceleration GPU, the H100, is configured with 80Gbytes of random-access memory (RAM), and its specification shows it’s rated at 350W in terms of power consumption.

China’s DeepSeek has been able to demonstrate that its R1 LLM can rival US artificial intelligence without the need to resort to the latest GPU hardware. It does, however, benefit from GPU-based AI acceleration.

Nevertheless, deploying a private version of DeepSeek still requires significant hardware investment. Running the entire DeepSeek-R1 model, which has 671 billion parameters, in-memory requires 768Gbytes of memory. With Nvidia H100 GPUs, which are configured with 80Gbytes of video memory each, 10 would be required to ensure the entire DeepSeek-R1 model can run in-memory.
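The GPU count is a ceiling division of the memory requirement by the memory per card. A minimal sketch of that sizing follows; the function names are illustrative, and the one-byte-per-parameter figure (8-bit weights) is an assumption used only to show why 768Gbytes is a plausible requirement for a 671 billion-parameter model, not a number given in the article.

```python
import math


def weights_gbytes(params_billion: float, bytes_per_param: float) -> float:
    """Approximate size of the model weights in Gbytes (1 Gbyte = 1e9 bytes)."""
    # billions of parameters x bytes per parameter = Gbytes
    return params_billion * bytes_per_param


def gpus_needed(required_gbytes: float, gbytes_per_gpu: float) -> int:
    """Smallest whole number of GPUs whose combined memory covers the requirement."""
    return math.ceil(required_gbytes / gbytes_per_gpu)


# Assumption: 1 byte per parameter (8-bit weights); not stated in the article.
print(weights_gbytes(671, 1.0))  # 671.0 Gbytes of weights, below the 768 quoted
print(gpus_needed(768, 80))      # 10 H100 cards with 80 Gbytes each
```

Rounding up matters here: 768 divided by 80 is 9.6, so nine cards would fall short and the tenth card is only partly used.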

IT leaders may well be able to negotiate volume discounts, but the cost of just the AI acceleration hardware to run DeepSeek is around $250,000.

Less powerful GPUs can be used, which may help to reduce this figure. But given current GPU prices, a server capable of running the complete 671 billion-parameter DeepSeek-R1 model in-memory is going to cost over $100,000.

The server could be run on public cloud infrastructure. Azure, for instance, offers access to the Nvidia H100 with 900 GBytes of memory for $27.167 per hour, which, on paper, should easily be able to run the 671 billion-parameter DeepSeek-R1 model entirely in-memory.

If this model is used every working day, and assuming a 35-hour week and four weeks a year of holidays and downtime (1,680 hours), the annual Azure bill would be almost $46,000. With a three-year commitment, the rate falls to $16.63 per hour, or about $28,000 per year at the same usage.
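The annual figures come from multiplying an hourly rate by the hours of use in a year. A minimal sketch of that arithmetic is below, assuming the 35-hour week and 48 working weeks described above; the function names are illustrative, and a quoted total may differ if it rests on other usage assumptions.

```python
def annual_hours(hours_per_week: float = 35, idle_weeks: int = 4,
                 weeks_per_year: int = 52) -> float:
    """Hours of use per year after removing holiday and downtime weeks."""
    return hours_per_week * (weeks_per_year - idle_weeks)


def annual_cost(hourly_rate: float, hours: float) -> float:
    """Yearly bill in dollars for a pay-per-hour cloud instance."""
    return hourly_rate * hours


hours = annual_hours()                     # 35 x 48 = 1680 hours
print(round(annual_cost(27.167, hours)))   # 45641 (pay-as-you-go rate)
print(round(annual_cost(16.63, hours)))    # 27938 (three-year commitment rate)
```

Running the instance around the clock instead (8,760 hours a year) would multiply these totals by a little over five, so the usage assumption dominates the result.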

Less powerful GPUs will clearly cost less, but it’s the memory costs that make these prohibitive. For instance, looking at current Google Cloud pricing, the Nvidia T4 GPU is priced at $0.35 per GPU per hour, and is available with up to four GPUs, giving a total of 64Gbytes of memory for $1.40 per hour. Twelve of these four-GPU configurations would be needed to fit the DeepSeek-R1 671 billion-parameter model entirely in-memory, which works out at $16.80 per hour. With a three-year commitment, this figure comes down to $7.68, which works out at just under $13,000 per year.
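
The T4 figures can be checked the same way. The sketch below assumes 16Gbytes per T4 (implied by four GPUs giving 64Gbytes) and reuses the working-hours assumption from the Azure estimate; all variable names are illustrative.

```python
import math

T4_MEMORY_GB = 16        # per GPU, so a four-GPU instance offers 64Gbytes
GPUS_PER_INSTANCE = 4
T4_HOURLY_RATE = 0.35    # quoted price per GPU per hour
MODEL_MEMORY_GB = 768
USAGE_HOURS_PER_YEAR = 35 * 48  # 35-hour week, 48 working weeks

instance_memory_gb = T4_MEMORY_GB * GPUS_PER_INSTANCE
instances = math.ceil(MODEL_MEMORY_GB / instance_memory_gb)

on_demand_hourly = instances * GPUS_PER_INSTANCE * T4_HOURLY_RATE
committed_annual = 7.68 * USAGE_HOURS_PER_YEAR  # quoted three-year rate

print(f"{instances} instances, ${on_demand_hourly:.2f} per hour on demand")
print(f"${committed_annual:,.0f} per year with a three-year commitment")
```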

A cheaper approach

IT leaders can reduce costs further by avoiding expensive GPUs altogether and relying entirely on general-purpose central processing units (CPUs). This setup is really only suitable when DeepSeek-R1 is used purely for AI inference.

A recent tweet from Matthew Carrigan, machine learning engineer at Hugging Face, suggests such a system could be built using two AMD Epyc server processors and 768 Gbytes of fast memory. The system he presented in a series of tweets could be put together for about $6,000.

Responding to comments on the setup, Carrigan said he is able to achieve a processing rate of six to eight tokens per second, depending on the specific processor and memory speed that is installed. It also depends on the length of the natural language query, but his tweet includes a video showing near-real-time querying of DeepSeek-R1 on the hardware he built based on the dual AMD Epyc setup and 768Gbytes of memory.

Carrigan acknowledges that GPUs will win on speed, but they are expensive. In his series of tweets, he points out that the amount of memory installed has a direct impact on performance. This is due to the way DeepSeek “remembers” previous queries to get to answers quicker. The technique is called Key-Value (KV) caching.

“In testing with longer contexts, the KV cache is actually bigger than I realised,” he said, and suggested that the hardware configuration would require 1Tbyte of memory instead of 768Gbytes when huge volumes of text or context are pasted into the DeepSeek-R1 query prompt.
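
To see why long contexts inflate memory needs, a rough KV cache estimate for a conventional transformer can be sketched as follows. It assumes standard multi-head attention, where every layer stores one key vector and one value vector per token. The layer count, head count and head size are made-up illustrative values, not DeepSeek-R1's real configuration, and DeepSeek's own attention design may well store less per token.

```python
def kv_cache_bytes(tokens: int, layers: int, kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2) -> int:
    """Approximate cache size: one key and one value vector per token per layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens

# Hypothetical model shape, chosen only for illustration.
LAYERS, KV_HEADS, HEAD_DIM = 60, 64, 128

for tokens in (1_000, 32_000, 128_000):
    gib = kv_cache_bytes(tokens, LAYERS, KV_HEADS, HEAD_DIM) / 2**30
    print(f"{tokens:>7} tokens -> {gib:.1f} GiB")
```

The cache grows linearly with the number of tokens held in context, which is consistent with Carrigan's observation that pasting in large volumes of text pushes the memory requirement up.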

Buying a prebuilt Dell, HPE or Lenovo server to do something similar is likely to be considerably more expensive, depending on the processor and memory configurations specified.

A different way to address memory costs

Among the approaches that can be taken to reduce memory costs is using multiple tiers of memory controlled by a custom chip. This is what California startup SambaNova has done using its SN40L Reconfigurable Dataflow Unit (RDU) and a proprietary dataflow architecture for three-tier memory.

“DeepSeek-R1 is one of the most advanced frontier AI models available, but its full potential has been limited by the inefficiency of GPUs,” said Rodrigo Liang, CEO of SambaNova.

The company, which was founded in 2017 by a group of ex-Sun/Oracle engineers and has an ongoing collaboration with Stanford University’s electrical engineering department, claims the RDU chip collapses the hardware requirements to run DeepSeek-R1 efficiently from 40 racks down to one rack configured with 16 RDUs.

Earlier this month at the Leap 2025 conference in Riyadh, SambaNova signed a deal to introduce Saudi Arabia’s first sovereign LLM-as-a-service cloud platform. Saud AlSheraihi, vice-president of digital solutions at Saudi Telecom Company, said: “This collaboration with SambaNova marks a significant milestone in our journey to empower Saudi enterprises with sovereign AI capabilities. By offering a secure and scalable inferencing-as-a-service platform, we are enabling organisations to unlock the full potential of their data while maintaining complete control.”

This deal with the Saudi Arabian telco provider illustrates how governments need to consider all options when building out sovereign AI capacity. DeepSeek demonstrated that there are alternative approaches that can be just as effective as the tried and tested method of deploying immense and costly arrays of GPUs.

And while DeepSeek-R1 does indeed run better when GPU-accelerated AI hardware is present, SambaNova is claiming there is an alternative way to achieve the same performance for running models like DeepSeek-R1 on-premise, in-memory, without the cost of acquiring GPUs fitted with the memory the model needs.

Balancing act: Managing business needs alongside digital transformation and innovation

When building a startup, there is a real balancing act between managing expectations, educating on what’s possible, and identifying the true cost of innovation. CTOs are challenged not only to build functional technology platforms quickly, but to do so as cost effectively as possible.

Startups are often not yet profitable and therefore don’t have a lot of cash to burn, meaning the CTO has to deliver technology solutions that meet business goals on a limited budget.

Let’s look at a legacy industry like commercial insurance, which has been undergoing a transformation in recent years. The industry is data-heavy, people-heavy and heavily regulated, which is why it’s ripe for innovation. It is also playing catch-up to address the needs of many consumers who want a seamless user experience and businesses that want a modern experience – faster, streamlined, digitised, and so on – when dealing with insurance providers. This is particularly true of the on-demand economy.

Leveraging technology

The on-demand economy is characterised by the likes of Taskrabbit, Doordash, Uber, Deliveroo and Amazon Flex. But it’s hard-working on-demand taxi and delivery drivers who are calling for flexible insurance that caters to their very specific needs, enabling them to buy comprehensive coverage for when they’re driving and to switch it off when they’re not.

However, many insurtechs have not adequately met these needs despite their ability to leverage technology more nimbly and effectively than traditional players. The business of insurance is complicated and innovation cannot be retrofitted with existing tech, which is why it’s vital to have a deep understanding of what the requirements are between the customer, the insurance partners and platforms like Uber and Amazon, for instance.

Transforming the on-demand insurance industry depends on a symbiotic relationship between the customer, the insurance provider and the platform. Although this can deliver real results for all, it also comes with its share of unique challenges.

Loss ratio – how much an insurance company spends on claims compared to the premiums it receives – is a key indicator of profitability. When insurtech startups focus too much on showy AI-driven gimmicks such as automatic claims payments within seconds, loss ratios suffer – and crucial insurance industry partners back away quickly. In the world of insurance, “innovation at all costs” simply doesn’t work.
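
As a quick illustration of the metric, the loss ratio is simply claims divided by premiums. The sketch below uses hypothetical figures, and the function name is an assumption made for the example.

```python
def loss_ratio(claims_paid: float, premiums_received: float) -> float:
    """Share of premium income that is spent on claims."""
    if premiums_received <= 0:
        raise ValueError("premiums_received must be positive")
    return claims_paid / premiums_received

# Hypothetical book of business: 650,000 paid in claims on 1,000,000 of premiums.
ratio = loss_ratio(claims_paid=650_000, premiums_received=1_000_000)
print(f"Loss ratio: {ratio:.0%}")  # Loss ratio: 65%
```

A feature that pays claims faster but less selectively raises the numerator without touching the denominator, which is how the ratio deteriorates.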

But technology cannot simply operate as a cost centre. By working in partnership with the rest of the business, startup CTOs and their teams need to focus on building an ongoing technology foundation to drive innovation within legacy industry structures and processes, driving business growth as well as consistent results for customers and partners.

Tech as augmentor – not replacement

Many of the challenges CTOs face aren’t necessarily about technology, but the change of mindset required when implementing tech solutions. Until very recently, insurance was an industry dominated by traditional players, governed by outdated systems and processes. While this is changing, there are still areas where bridges must be built between the promise of what technology can deliver and a certain “this is how it’s always been done” mindset.

For example, we know that insurance, like many industries, is ripe for reinvention through smart uses of AI – as long as it is implemented in the most appropriate areas of the business, and used as an augmented assistant rather than a replacement for specialist expertise.

At Inshur, working in combination with a team from Google Cloud, we were able to build an AI assistant for our claims team and demonstrate to management its effectiveness in helping the team prioritise work as well as speeding up administrative tasks, while providing fast and effective customer service. We’re continuing to roll out this technology internationally, as well as add further features to augment the human adjusters and utilise their expertise while saving them time.

The assistant helps the team to quickly scan incoming documents, including email, physical letters, attachments or transcribed phone calls; infer the data, including who is the sender and the intention of the communication; identify important and useful information such as vehicle registration and claimant name; identify the priority and urgency of the claim; assign it to the right team; and summarise the data into a standard format for ease of use. By automatically accepting feedback, retraining, and learning from past actions, the assistant also helps guide handlers with proposed next steps, helping to train new claims handlers.
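
The "standard format" produced by the steps above might look something like the record below. This is a hypothetical sketch rather than Inshur's actual schema: the field names, the priority levels and the sorting rule are all assumptions made for illustration, and the AI extraction step itself is not shown.

```python
from dataclasses import dataclass
from enum import IntEnum

class Priority(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3

@dataclass
class ClaimSummary:
    """Hypothetical standardised summary of one incoming communication."""
    sender: str
    intent: str
    vehicle_registration: str
    claimant_name: str
    priority: Priority
    assigned_team: str

def work_queue(summaries: list[ClaimSummary]) -> list[ClaimSummary]:
    """Order summaries so the most urgent claims are handled first."""
    return sorted(summaries, key=lambda s: s.priority, reverse=True)

inbox = [
    ClaimSummary("broker@example.com", "status query", "AB12 CDE",
                 "A. Driver", Priority.LOW, "customer service"),
    ClaimSummary("claimant@example.com", "new injury claim", "XY34 ZZZ",
                 "B. Rider", Priority.HIGH, "injury claims"),
]

for item in work_queue(inbox):
    print(item.priority.name, item.assigned_team, item.claimant_name)
```

Keeping every communication in one shape, whatever channel it arrived through, is what lets handlers sort and route work consistently.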

The AI-based tools we built to support our claims teams have enabled us to see patterns that are also a good fit for other departments within the business. So much so, that we see potential for the commoditisation of these approaches to a wider set of solutions that serves not just insurance, but any business.

Build or buy?

Another question a lot of startup CTOs are asked is whether to build or buy. Building tech solutions from scratch can carry significant risk, especially given the resource investment typically required. But when every business in a given market is using the same platforms – usually with significant tweaks and workarounds to fit their specific needs – then nobody can truly win the innovation race.

First-movers must always be willing to build when necessary, and to buy when prudent.

For example, we decided that we needed to invest in developing our own solutions to problems that could not be adequately solved by off-the-shelf products. One such product is our Pay-as-you-flex wallet for Amazon Flex. While traditional insurance has historically covered drivers at all times, including when they’re not driving, we knew that technology held the key to delivering a new insurance product that would enable delivery drivers to pay only for the cover they needed, when they needed it.

As it was the first product of its kind to enter the market, we knew that we’d need to build it from scratch.

It’s only since we built our proprietary platform to manage business-critical processes including policy administration, claims management and billing that similar products have entered the market. By building a platform that’s fully tailored to the specific needs of the market we serve, we’ve paved the way for other insurers to do the same for their customers and partners.

However, the startup CTO must also take the lead in conversations where buying makes most sense, securing buy-in from other senior stakeholders and identifying the most appropriate vendors to partner with. Often, particularly in a high-growth startup where cost and return on investment are key considerations, this will involve a detailed assessment of risk for all available scenarios.

In Inshur’s case, we’re working with Google Cloud to implement several of its AI products to drive efficiencies and ensure that customers are treated fairly – which is both a regulatory and moral imperative in the insurance industry.

We know that our customers drive for a living, which means they often need to call us via their hands-free mobile technology while driving in between journeys, rather than emailing or speaking to a text-based chatbot. 

When we identified that a significant proportion of the calls coming into our customer service team could be quickly and effectively answered by an AI-driven solution, we implemented a “smart virtual agent” to handle more straightforward queries, enabling the team to focus more on serving customers with specific or detailed questions.

Bridging the gap

Because of the crucial role technology such as AI will play in the coming years, CTOs will need to ensure they are consistently developing deep understanding and expertise, not just in the latest technology innovations but also how they can be implemented to drive business strategy and growth.

Crucially, this will include taking a leadership role in helping to educate stakeholders across the business on the best use cases for AI tools and other solutions, building understanding at every level around what the technology can and can’t help with, and putting clear structure and process around innovation.

This ability to bridge the gap between the business and technology is already becoming a crucial indicator of future success.

Chris Gray is chief technology officer at vehicle insurance provider Inshur.

Google’s new AI solves superbug problems in two days

Superbugs are what we routinely call bacteria that have grown resistant to antibiotics. The name isn’t an exaggeration, as these acquired defenses allow the superbugs to survive and potentially wreak more havoc.

Scientists like Professor José R Penadés’s team at Imperial College London have spent years trying to figure out how superbugs get their powers, in the hope of finding a solution. They had been working on a theory they did not share with anyone until the arrival of a new piece of AI.

Google’s newly released co-scientist AI (based on Gemini 2.0) needed only 48 hours to reach the same conclusion as the scientists after a decade of research. The AI also came up with additional reasons why a bug might get superpowers, including a new concept scientists are now studying.

Speaking on BBC Radio Four’s Today programme, Penadés described learning that the AI had solved the problem in two days.

“I was shopping with somebody, I said, ‘please leave me alone for an hour, I need to digest this thing,’” he said. Penadés emailed Google to inquire whether Google had access to his data and whether the AI had seen his team’s work to reach that conclusion so fast.

That would be the first thing that would cross anyone’s mind when seeing AI guess exactly the premise you were working on. But Google denied having access to the research. The co-scientist AI came up with the answer on its own.

“It’s not just that the top hypothesis they provide was the right one,” the researcher said. “It’s that they provide another four, and all of them made sense. And for one of them, we never thought about it, and we’re now working on that.”

It’s unclear what the AI suggested that the researchers didn’t figure out on their own, but the experience shows how AI can help with specific tasks, including research. While the AI confirmed researchers’ findings, the speed of such an advanced computer program could save them years of research. They could spend the time working on solutions for the problems they’re studying. That’s another area where AI could help.

So, what was Google’s co-scientist AI doing that was so brilliant? The program proposed that bacteria that become superbugs can form a tail from different viruses, which lets them continue spreading across species.

The researchers have been working on this assumption for a decade. However, Penadés’s team did not share their ongoing work with anyone outside the organization and did not publish any findings.

AI will not always find the right solution for your needs, and it will continue to imagine things that are not real. But fine-tuned AI like Google’s co-scientist might become a tool other researchers use to advance their work quickly.

Penadés told the BBC that he understands the worries about AI taking over jobs, but what he saw with Google’s AI doesn’t fall into the same category. “When you think about it it’s more that you have an extremely powerful tool,” he said.

“I feel this will change science, definitely,” Penadés added. “I’m in front of something that is spectacular, and I’m very happy to be part of that. It’s like you have the opportunity to be playing a big match – I feel like I’m finally playing a Champions League match with this thing.”

His team also thinks the new Google AI will be useful in the future. It’ll be interesting to see whether the researchers can actually find ways to kill the superbugs with existing or new antibiotics now that they can rely on AI to help.

Warning over privacy of encrypted messages as Russia targets Signal Messenger

Russia-backed hacking groups have developed techniques to compromise encrypted messaging services, including Signal, WhatsApp and Telegram, placing journalists, politicians and activists of interest to the Russian intelligence service at potential risk.

Google Threat Intelligence Group disclosed today that Russia-backed hackers had stepped up attacks on Signal Messenger accounts to access sensitive government and military communications relating to the war in Ukraine.

Analysts predict it is only a matter of time before Russia starts deploying hacking techniques against non-military Signal users and users of other encrypted messaging services, including WhatsApp and Telegram.

Dan Black, principal analyst at Google Threat Intelligence Group, said he would be “absolutely shocked” if he did not see attacks against Signal expand beyond the war in Ukraine and to other encrypted messaging platforms. 

He said Russia was frequently a “first mover” in cyber attacks, and that it would only be a matter of time before other countries, such as Iran, China and North Korea, were using exploits to attack the encrypted messages of subjects of intelligence interest.

The warning follows disclosures that Russian intelligence created a spoof website for the Davos World Economic Forum in January 2025 to surreptitiously attempt to gain access to WhatsApp accounts used by Ukrainian government officials, diplomats and a former investigative journalist at Bellingcat.

Linked devices targeted 

Russia-backed hackers are attempting to compromise Signal’s “linked devices” capability, which allows Signal users to link their messaging account to multiple devices, including phones and laptops, using a quick response (QR) code.

Google threat analysts report that Russia-linked threat actors have developed malicious QR codes that, when scanned, will give the threat actor real-time access to the victim’s messages without having to compromise the victim’s phone or computer.

In one case, according to Black, a compromised Signal account led Russia to launch an artillery strike against a Ukrainian army brigade, resulting in a number of casualties.

Russia-backed groups have been observed disguising malicious codes as invites for Signal group discussions or as legitimate device pairing instructions from the Signal website. 

In some targeted spear phishing attacks, Russia-linked hackers have also embedded malicious QR codes in phishing websites designed to mimic specialist applications used by victims of the attack.

Russia-compromised Signal found on battlefield phones

The Russia-linked Sandworm group, also known as APT44, which is linked to the General Staff of the Armed Forces of the Russian Federation, has worked with Russian military forces in Ukraine to compromise Signal accounts on phones and computers captured on the battlefield.

Google’s Mandiant researchers identified a Russian language website giving instructions to Russian speakers on how to pair Signal or Telegram accounts with infrastructure controlled by APT44.

“The extrapolation is that this is being provisioned to Russian forces to be able to deploy captured devices on the battlefield and send back the communications to the GRU to be exploited,” Black told Computer Weekly.

Russia is believed to have fed the intercepted Signal communications back to a “data lake” to analyse the content of large numbers of Signal communications for battlefield intelligence.

Compromise likely to go undetected

The attacks, which are based on exploiting Signal’s device linking capability, are difficult to detect and, when successful, there is a high risk that compromised Signal accounts can go unnoticed for a long time.

Google has identified another cluster of Russia-backed attackers, known as UNC5792, that has used modified versions of legitimate Signal group invite pages which link the victim’s Signal account to a device controlled by the hacking group, enabling the group to read and access the target’s Signal messages.

Other Russia-linked threat actors have developed a Signal “phishing kit” designed to mimic components of the Kropyva artillery guidance software used by the Ukrainian military. The hacking group, known as UNC4221, previously used malicious web pages designed to mimic legitimate security alerts from Signal.

The group has also used a lightweight JavaScript payload, known as Pinpoint, to collect basic user information and geolocation data from web browsers.

Google has warned that the combination of access to secure messages and location data of victims is likely to be used to underpin targeted surveillance operations or to support conventional military operations in Ukraine.

Signal databases attacked on Android

Google also warned that multiple threat actors have been observed using exploits to steal Signal database files from compromised Android and Windows devices.

In 2023, the UK’s National Cyber Security Centre and the Security Service of Ukraine warned that the Sandworm hacking group had deployed Android malware, known as Infamous Chisel, to search for messaging applications, including Signal, on Android devices.

The malware is able to scan infected devices for WhatsApp messages, Discord messages, geolocation information and other data of interest to Russian intelligence. It is able to identify Signal and other messages and “package them” in unencrypted form for exfiltration.

APT44 operates a lightweight Windows batch script, known as WaveSign, to periodically query Signal messages from a victim’s Signal database and to exfiltrate the most recent messages.

Russian threat actor Turla, which has been attributed by the US and the UK to the Russian Federal Security Service, has used a lightweight PowerShell script to exfiltrate Signal desktop messages.

And in Belarus, an ally of Russia, a hacking group designated as UNC1151 has used a command-line utility, known as Robocopy, to stage the contents of file directories used by Signal desktop to store messages and attachments for later exfiltration.

Encrypted messaging services under threat

Google has warned that attempts by multiple threat actors to target Signal serve as a warning of the growing threat to secure messaging services and that attacks are certain to intensify in the near term.

“There appears to be a clear and growing demand for offensive cyber capabilities that can be used to monitor the sensitive communications of individuals who rely on secure messaging applications to safeguard their online activity,” it said.

Attacks exploit ‘legitimate function’

Users of encrypted communications are not just at risk from phishing and malware attacks, but also from the capability of threat actors to secure access to a target’s device – for example, by breaking the password.

Black said it was insidious that Russian attackers were using a “legitimate function” in Signal to gain access to confidential communications, rather than compromising victims’ phones or breaking the encryption of the app.

“A lot of audiences who are using Signal to have sensitive communications need to think about the risk of pairing their device to a second device,” he said.

WhatsApp and Telegram targeted

Russia-aligned groups have also targeted other widely used messaging platforms, including WhatsApp and Telegram.

A Russian hacking group linked to Russia’s FSB intelligence service, known variously as Coldriver, Seaborgium, Callisto and Star Blizzard, shifted its tactics in late 2024 to launch social engineering attacks on people using WhatsApp encrypted messaging.

The group targets MPs, people involved in governments or diplomacy, research and defence policy, and organisations or individuals supporting Ukraine.

As exposed by Computer Weekly in 2022, Star Blizzard previously hacked, compromised and leaked emails and documents belonging to a former head of MI6, alongside other members of a secretive right-wing network devoted to campaigning for an extreme hard Brexit.

Scottish National Party MP Stewart McDonald was another victim of the group. Left-wing freelance journalist Paul Mason, who has frequently criticised Putin’s war against Ukraine, was also targeted by the group and his emails were leaked to the Grayzone, a pro-Russian publication in the US.

Academics from the universities of Bristol, Cambridge and Edinburgh, including the late Ross Anderson, professor of security engineering, published research in 2023 warning that the desktop versions of Signal and WhatsApp could be compromised if accessed by a border guard or an intimate partner, enabling them to read all future messages.

Signal hardens security

Signal has taken steps to improve the security of its pairing function to alert users to possible attempts to gain access to their accounts through social engineering tactics, following Google’s findings.

Josh Lund, senior technologist at Signal, said the organisation had introduced a number of updates to mitigate potential social engineering and phishing attacks before it was approached by Google.

“Google Threat Intelligence Group provided us with additional information, and we introduced further improvements based on their feedback. We are grateful for their help and close collaboration,” he told Computer Weekly.

Signal has since made further improvements, including overhauling the interface to provide additional alerts when someone links a new device. 

It has also introduced additional authentication steps to prevent anyone other than the owner of the primary device from adding a new linked device. When any new device is linked to a Signal account, the primary device will automatically receive a notification, allowing users to quickly review and remove any unknown or unwanted linked devices.

Google Threat Intelligence Group’s Black advised people using the Signal app to think carefully before accepting links to group chats.

“If it’s a contact you know, just create the group yourself directly. Don’t use external links to do things that you can do directly using the messaging application’s features,” he said.

Read more about Russian attacks on Signal in Dan Black’s blog post.
