Posted on

Does the Nvidia RTX 5090 have a cable melting problem? It’s complicated

  • A small number of reports of RTX 5090 power cables overheating and melting have been confirmed
  • This follows widespread reports of similar issues with the previous RTX 4090
  • However, it’s possible that third-party cables could be to blame this time around

Remember ‘cablegate’? Back in late 2022, users started to report that the power connectors of their Nvidia RTX 4090 graphics cards were overheating and essentially melting into unusable hunks of plastic – and now, according to some buyers, the same issue could be plaguing the newly-released RTX 5090.

Now, I covered the cablegate fiasco when the story was at its peak, and at the time, I was willing to assign at least some portion of the blame to Nvidia, as the PCIe Special Interest Group (PCI-SIG) had previously published a report warning of ‘thermal variance’ risks regarding the 12VHPWR adapter used for the RTX 4090. This time around, though, I’m really not so sure Nvidia is at fault.

For starters, the issues appear far less widespread than with the RTX 4090; while there were literally dozens of reports back in 2022 (which continued well into 2024), we’ve only seen two isolated confirmed cases of cable-melting with regard to the RTX 5090. The first came from a Reddit user, while the second was reported by the Spanish YouTube channel Toro Tocho Reviews. Both reported the same issue: the power cable overheated and melted at both ends, something we didn’t see in the majority of RTX 4090 connector failures.

Secondly, the first of these cases was confirmed to have involved a third-party power cable from PC-modding supplier MODDIY, introducing a new potential point of failure. Nvidia has now moved over to the 12V-2×6 connection standard for more stable power delivery and more secure pin connections, and although MODDIY claims its cables support the new standard, the Reddit user stated that they’d been using this cable for two years with an RTX 4090. Backward compatibility with third-party 12VHPWR cables is likely to continue to be an issue for Nvidia’s GPUs – notably, MODDIY now has a page on its website advising buyers with RTX 5000-series GPUs to purchase new-for-2025 12V-2×6 cables.

So is there really a problem?

In other words, at least one of these cable-melting cases appears to have been caused by user error: the 12VHPWR cable that melted, despite being physically compatible with the RTX 5090, was presumably unable to handle the power delivery taking place. Nvidia’s latest flagship GPU is a hungry girl, after all, with an obscene 575W TDP.

PC modders are gonna mod, of course, but given the known issues with the previous-gen card’s power connector, I’d personally be very reluctant to use anything but the cables supplied in the box at this point. A Reddit megathread on the topic has been created to compile additional cases, and there’s a fair amount of debate in the comments as to whether Nvidia is to blame or if users should be taking more care to avoid third-party cables – even if they claim to be compatible.

Naturally, I reached out to my contact at Nvidia to ask for a quote, but Team Green declined to comment – not even a ‘we’re investigating’, instead pointing me to MODDIY’s page warning about using older 12VHPWR cables. It seems Nvidia feels more confident this time around, further reinforcing the idea that the cases we’ve seen so far were caused not by the GPUs themselves but rather third-party hardware.

It is still possible that we’re only at the beginning of a tidal wave of similar reports – given the extremely limited availability of the RTX 5090 at launch, we might be yet to see the full extent of the issue as only a small number of users have managed to get their hands on the GPU.

Still, we shouldn’t jump to conclusions. Two cases (with a few more unconfirmed) aren’t exactly the cavalcade of issues we saw after the RTX 4090 launch, so there’s every chance these problems won’t be so widespread. If you were lucky enough to snag one of Nvidia’s new flagship GPUs, my only advice is this: stick with the supplied cables for now! If you’ve got thoughts on this, please feel free to tell me what a genius/idiot I am in our shiny new comments section below. Frankly, I’d love to chat with someone who actually managed to buy one of these cards…

Nvidia announces RTX 5070 Ti GPU is out on February 20, but RTX 5070 is delayed to March 5 – and I’m far from surprised

  • Nvidia has revealed that the RTX 5070 Ti goes on sale on February 20
  • It’ll be followed by the RTX 5070 on March 5, despite Nvidia originally saying this GPU would arrive in February as well as the Ti
  • The RTX 5070 Ti launch date, and delay to early March for the RTX 5070, were predicted by the rumor mill

Nvidia has confirmed the launch dates of its RTX 5070 graphics cards, with the RTX 5070 Ti arriving as planned in February, but with the vanilla RTX 5070 having been pushed out a bit to March, as rumors had already insisted was the case.

Nvidia updated its official web page for the RTX 5070 models with the exact dates, which are February 20 for the RTX 5070 Ti, and March 5 for the RTX 5070.

Team Green also posted on X to reveal the information about the RTX 5070 Ti, and how the graphics card is coming from third-party board makers and PC builders – remember, this variant won’t have a Founders Edition from Nvidia itself.

The RTX 5070, on the other hand, will have a Founders Edition, but you’ll be waiting a couple more weeks to attempt to order that, or to try and grab a third-party graphics card.

Try being the operative word here, as it remains to be seen what kind of stock levels the RTX 5070 models will launch with.

Analysis: The rumors were right

It doesn’t bode particularly well that back when all the new Blackwell graphics cards were revealed at CES 2025, Nvidia said that both of these RTX 5070 variants were due to arrive in February at some point (without giving any specific dates). Pretty soon after, there were early rumors suggesting the RTX 5070 non-Ti version would be delayed to early March, and that’s exactly what has happened.

Why the (slight) delay, then? Could this be about running interference with AMD’s RX 9070 launch somehow, which is also in March? I doubt that; a more likely explanation lies in the most recent chatter from the grapevine. Namely, the tale spun earlier today about stock of the RTX 5070 possibly being seriously thin on the ground – potentially a similar situation to the one we’ve seen with the RTX 5090 and 5080.

Now, that’s just a rumor, and I’m not saying it’s right, but it certainly makes some sense when you put everything together in the cold light of day. Given Nvidia’s stock woes have been pretty disastrous so far, if the RTX 5070 were to arrive in healthy quantities, it’d be a surprise. It’s certainly a believable theory that the vanilla 5070 needed to be pushed back a bit to ensure something like a half-decent amount of these graphics cards end up on shelves.

The other problem with the RTX 5070 is that given its more affordable price tag, this GPU is going to be in far greater demand than any of the Blackwell graphics cards we’ve seen so far – making a tighter supply a potentially much worse scenario.

Furthermore, bear in mind that the same rumor peddler who was right about the RTX 5070 delay also suggested the RTX 5060 has been pushed back (from March to April).

Naturally, stay skeptical on that particular RTX 5060 nugget, but the other widely spread rumor, the one about the RTX 5070 Ti launching on February 20, has also turned out to be correct. In short, the rumor mill appears to have nailed what’s happening with incoming Blackwell GPUs pretty well thus far, so at this point, I’m definitely not betting against it.

Cyber Monitoring Centre develops hurricane scale to count cost of cyber attacks

The CrowdStrike incident in 2024 hit the UK like a hurricane. As it swept across the country, it brought flights to a standstill, forced hospitals to cancel operations, and brought down the computer systems and websites of hundreds of businesses.

Since the early 1970s, it has been possible to predict the damage likely to be caused by a hurricane using a five-point wind scale.

Category one hurricanes may damage roofs or break branches on trees, while at the other end of the scale, a category five hurricane could leave areas uninhabitable for months.

There’s no such way to categorise the destructive impact of cyber events like the CrowdStrike update, which brought down Windows computers worldwide in July 2024 – but that is set to change, as an initiative gets underway this year to assess the damage caused by major cyber attacks on a five-point scale.

The Cyber Monitoring Centre (CMC), the first organisation of its type, has been set up by the insurance industry as an arm’s-length organisation to assess the impact of serious cyber attacks that have systemic implications for the UK’s infrastructure and services. It aims to make it easier for businesses to buy cyber insurance cover, and know exactly what will be covered and what won’t.

There are many ways to assess the impact of a cyber event. It could be measured in loss of life through cancelled hospital operations, the disruption caused by leaks of people’s personally identifiable information on the internet, or the strategic implications of the loss of classified government information to a hostile nation-state.

The CMC will focus on just one: the economic impact. The centre has appointed a technical committee of eminent experts to assign cyber events to a five-point scale ranging from small-scale disruptions impacting hundreds of people to catastrophic attacks affecting hundreds of thousands. Damage impacts range from less than £100m for category one events to more than £5bn for category five.

The centre plans to monitor press reports and reports from business organisations to identify significant cyber attacks with multiple victims. It has partnerships with data providers to provide statistics on cancelled flights and disruption to datacentres, and works with the NHS to gather data on cancelled operations and hospital procedures. It also has access to advice from legal experts and cyber security specialists who respond to incidents, to help it build financial models of each significant cyber event. The models are reviewed and stress-tested. The final say goes to the CMC’s technical committee.

The centre aims to produce an impact report within 30 days of the cyber event that will focus on immediate financial losses. It will not take into account longer-term losses caused by, for example, the risk of litigation, or other delayed effects.

What counts as a cyber war and who decides?

The aim of the CMC is to make it easier for companies to buy cyber insurance and know what magnitude of cyber event on the five-point scale they can expect to be covered for, said Ed Lewis, a director and founder of the centre, and CEO of risk advisory service CyXcel.

The insurance industry has long struggled with how to insure cyber risks. Back in 2022, Lloyd’s of London issued a bulletin mandating the exclusion of “cyber war incidents” from cyber insurance cover. But who would decide whether a cyber attack was an act of warfare by a hostile state? Government or insurers?

Add to that the complex exclusion clauses developed by the London market for cyber insurance, and it was a “lawyer’s dream”, said Lewis.

It became clear that what mattered most was not which country was responsible for an act of cyber warfare, but the scale and severity of an attack. If a cyber attack had the digital fingerprints to show that it was directed against multiple targets, it had the hallmarks of a “systemic attack”.

Some insurers, particularly those that insure multiple small and medium-sized businesses, do not cover systemic risks. That is to avoid large losses if multiple clients are hit by the same catastrophic incident. However, businesses can obtain insurance cover to protect against systemic risks from other specialist insurers.

During the summer of 2022, Lewis went with a team of lawyers from his firm, Weightmans, working with insurer CFC, to France for six weeks to hammer out a solution. They came up with the idea of creating a company limited by guarantee to act as an independent centre of expertise on systemic cyber attacks.

The team spent the first half of 2023 developing a methodology to assess the financial impact of cyber attacks on a five-point, hurricane-inspired scale, and in October that year incorporated CMC as a company limited by guarantee.

The most talked-about cyber attacks are not the most damaging 

The centre reviewed three cyber attacks in a trial run in 2024, and the results were surprising. Some of the most talked-about cyber attacks were not necessarily the most damaging to the UK economy.

Take the attack on the file transfer service MOVEit in May 2023. It affected more than 2,000 organisations and exposed the personal data of around 64 million people.

Although it generated headlines around the world and captivated the attention of the cyber security community, the economic impact of the MOVEit attack on the UK was as “close to negligible” as it is possible to reach on the CMC’s “hurricane” scale.

In June 2024, another ransomware group struck pathology laboratory Synnovis, which processes blood tests for NHS organisations across London. The attack led to major disruptions for GP surgeries and NHS trusts, leading to delays in medical procedures, cancelled appointments and shortages of blood stocks.

Despite attracting mass interest, CMC judged the economic impact as relatively low, at between £100m and £1bn, with less than 0.1% of the population affected. That won it a rating of category two on the five-point scale. 

The failure of an update to CrowdStrike’s security software in July 2024 caused worldwide disruption to Windows computers, but after an initial burst of press coverage, it failed to capture the public’s continued interest. However, CMC’s experts rated CrowdStrike as a category three incident – significantly more impactful than MOVEit and Synnovis.
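
To make the financial side of the scale concrete, here is a minimal Python sketch that maps an estimated loss to the bands quoted in this article. It is an illustration only, not the CMC’s methodology: the function name `financial_band` and the constants are hypothetical, the handling of values that fall exactly on a boundary is an assumption, and the sketch ignores the share of the population affected, which the centre also weighs. The article gives figures only for categories one, two and five, so anything between £1bn and £5bn is reported as “3-4” rather than guessed.

```python
# Illustrative only: bands are the figures quoted in the article, in GBP.
# The boundary between categories three and four is not stated, so it is
# deliberately left unresolved.
CAT_ONE_MAX = 100_000_000      # "less than £100m" -> category one
CAT_TWO_MAX = 1_000_000_000    # "between £100m and £1bn" -> category two
CAT_FIVE_MIN = 5_000_000_000   # "more than £5bn" -> category five


def financial_band(loss_gbp: float) -> str:
    """Return the category band implied by the article's loss figures.

    Assumption: a loss exactly on a boundary falls into the higher band
    at £100m and £1bn, and only losses strictly above £5bn reach five.
    """
    if loss_gbp < 0:
        raise ValueError("loss must be non-negative")
    if loss_gbp < CAT_ONE_MAX:
        return "1"
    if loss_gbp < CAT_TWO_MAX:
        return "2"
    if loss_gbp <= CAT_FIVE_MIN:
        return "3-4"  # the article does not say where three ends and four begins
    return "5"


if __name__ == "__main__":
    for loss in (50_000_000, 500_000_000, 2_000_000_000, 6_000_000_000):
        print(f"£{loss:,} -> category {financial_band(loss)}")
```

On these bands, a loss in the range reported for Synnovis lands in category two, matching the rating described above.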

How the Cyber Monitoring Centre rated three high-profile cyber events

The need for trust and independence

The CMC’s assessments may not be infallible, but they come with a clear methodology and use data to inform the technical committee’s decisions, all of which will be published and open to public scrutiny. 

The idea is that the centre will act very much like an independent arbitrator. Companies offering insurance and those buying insurance will be able to agree to be bound by its decision in any dispute over insurance cover. 

That means the centre will need to be seen as completely independent of the insurance industry and government, and that it will need to build a reputation for trusted decisions if it is to be successful. 

The centre’s current plans are to raise funding through membership fees, with the organisation hoping to attract members from a wide range of industries, such as professional services, manufacturing, retail and insurers. Lewis stressed, however, that insurers and government will have no influence over the CMC’s assessments. 

“What we are very clear on is that the work of the technical committee has to be independent of government and independent of insurers,” he said. “They have to be as far as practically possible, beyond the potential for impeachment.” 

CMC could impact government policy

The work of the CMC is likely to influence the direction of government policy over cyber risks. Many hope it will help to shift the balance of regulation from policing data leaks to policing cyber failures that result in the loss of essential services.

CMC chair Ciaran Martin cited as an example an attack by the Conti ransomware group on the Irish health service, which disrupted healthcare for months in 2021.

When the Irish state refused to immediately pay the ransom, the Conti crime group stepped up the pressure by releasing medical data on the internet. It was only at that point that Ireland’s Health Service Executive was obliged to notify regulators about the incident. 

“It’s such a stark illustration of the point that a whole national healthcare system, including cancer surgeries had to stop, and that’s not a breach of obligations, but the loss of a small amount of medical data [was considered a breach],” he told Computer Weekly.

That could change in the UK if the Cyber Security and Resilience Bill passes through parliament as expected. It introduces obligations for organisations to maintain critical services, and could lead to mandatory reporting of ransomware attacks.

“I’m not saying, ‘Let’s repeal data regulation and let’s impose sweeping service obligations on small hairdressing salons’, but I’m saying, ‘Let’s think about it carefully’,” said Martin.

If you give a victim the choice between two bad situations – one is the loss of critical health services and the other is the loss of their personal data – most people would opt for losing personal data rather than losing access to medical care, he added. 

Lewis concurs. “There seems to be a disproportionate focus on cyber incidents that also involve a data breach,” he said. “I think it’s probably fair to say there’s been quite a bit of criticism of the Information Commissioner’s Office and how those powers have been used over recent times.”

Need to tackle ‘victim stigma’

He hopes that the CMC can remove what he calls “victim stigma”, where fear of bad publicity or litigation can lead organisations hit by cyber attacks to opt for secrecy rather than openness.

There are signs that this is happening already. The British Library, which faced major disruption after an attack by the Rhysida ransomware gang, published a comprehensive lessons-learned report, which was widely applauded in the cyber security community.

The Harris Federation, a network of schools in London and the South East that lost email and telephone access after a ransomware attack in 2021, has talked about its experience in a series of podcasts to help others improve their own cyber resilience.

For Martin, the CMC’s primary aim is to deliver a better functioning insurance market and better provision for companies seeking to insure against cyber attacks.

He would like to see the CMC gain credibility over time as a source of factual information for academic, government and industry papers.

And if the CMC is doing its job, he said, the media will be able to get a better handle on which cyber incidents are serious and which are likely to have a minor economic impact.

iOS 18.4: New features, release date, AI updates, more

After waiting almost a month, Apple finally started beta testing iOS 18.4. This is one of the most important updates of the iOS 18 cycle so far. Here’s everything you need to know about iOS 18.4, including all of its features, the expected release date, and a major Apple Intelligence upgrade that Apple has planned for this release.

Release Date

Unlike with other software updates, Apple has already teased that iOS 18.4 will be available in April. With the iPhone 16e announcement, the company said this version would launch early that month.

Apple Intelligence upgrades

With iOS 18.4 beta 1, Apple added several new Apple Intelligence features:

New languages: Apple adds Chinese, French, German, Italian, Brazilian Portuguese, Spanish, Japanese, Korean, and localized English for Singapore and India.

Image Playground: The long-awaited Sketch style is now available alongside the Animation and Illustration options.

Genmoji: Apple tweaked the Genmoji icon on the keyboard, as it now reads “Genmoji.”

Mail Categorization: Apple added Mail Categorization for iPad users with iPadOS 18.4 beta 1.

iOS 18.4 beta 1 features

The first iOS 18.4 beta includes several new features. These are the most important:

Apple News+ Food: This update will bring a new Food section to Apple News. Subscribers can access recipes, tips for healthy eating, restaurants, and more.

Vision Pro app: With iOS 18.4, Apple Vision Pro will get its own iPhone app. It will help you download apps, visionOS content, tips, and information and even set up Guest Mode.

Apple Maps change: You can now set a Preferred Language for directions that differs from the one you use on your iPhone.

Ambient music: iOS 18.4 adds new Control Center toggles for Ambient Music, including Chill, Productivity, Sleep, and Wellbeing.

CarPlay update: Cars with bigger screens now get three rows of apps displayed.

iOS 18.4 beta 2 features

The second iOS 18.4 beta includes several new features. These are the most important:

New emoji: Apple finally added the seven emojis teased by the Unicode Consortium last year. Still, they’re not as fun as you’d expect.

Visual Intelligence: Apple added the Visual Intelligence feature to the Action Button while also adding support for the iPhone 15 Pro and iPhone 16e.

Control Center: The Control Center now displays an Apple Intelligence section with three options: Talk to Siri, Type to Siri, and Visual Intelligence.

Apple Vision Pro app: If you have an Apple Vision Pro, iOS 18.4 beta 2 adds the already-announced Vision Pro app.

App Store: The latest beta lets you pause an app download from the App Store.

iPhone support: iPhone 12 and iPhone 16e users can now join the iOS 18.4 beta testing.

Device compatibility

iOS 18.4 is compatible with the following devices:

  • iPhone XR, XS, and XS Max
  • iPhone 11
  • iPhone 11 Pro and 11 Pro Max
  • iPhone SE (2nd gen)
  • iPhone 12 mini and iPhone 12
  • iPhone 12 Pro and iPhone 12 Pro Max
  • iPhone 13 mini and iPhone 13
  • iPhone 13 Pro and iPhone 13 Pro Max
  • iPhone SE (3rd gen)
  • iPhone 14 and iPhone 14 Plus
  • iPhone 14 Pro and iPhone 14 Pro Max
  • iPhone 15 and iPhone 15 Plus
  • iPhone 15 Pro and iPhone 15 Pro Max
  • iPhone 16 and iPhone 16 Plus
  • iPhone 16 Pro and iPhone 16 Pro Max
  • iPhone 16e

AMD confirms big reveal for RX 9070 GPUs on February 28, on-sale date is early March – so it looks like a head-to-head clash with Nvidia’s RTX 5070

  • AMD has confirmed its full RDNA 4 launch event for February 28
  • These initial RX 9070 models will go on sale in early March
  • That sets up the RX 9070 for a showdown against Nvidia’s RTX 5070 which hits shelves on March 5

AMD has revealed that its RDNA 4 graphics cards will get a full launch event – as opposed to the fleeting announcements made at CES 2025 – in two weeks, ahead of the March release for these GPUs.

David McAfee, who is AMD’s VP and GM of Ryzen and Radeon, let us know the date and time to mark in our calendars is February 28 at 8am EST (5am PST, 1pm UK time) via a post on X.

The AMD exec also said that the long-awaited RX 9070 models will hit shelves in early March.

When the RX 9070 XT and plain RX 9070 were announced back at CES 2025 last month, the broad expectation was that they’d arrive earlier in the first quarter, rather than later.

That hope had cold water poured over it when AMD confirmed these RDNA 4 graphics cards were delayed to March, and McAfee took to X to explain why: namely, to make sure that AMD’s Adrenalin graphics drivers are fully tuned and ready to deliver the best performance for RX 9070 GPUs out of the gate, and also to bring in more support for FSR 4 in PC games (achieving the same end, effectively).

Analysis: One final worry…

Crucially, McAfee also mentioned that another reason for putting off the release of RDNA 4 GPUs to March was to build up stock levels of the graphics cards at retail.

Now, I’m reading the release date being set at “early March” to mean the first week of next month, and that makes sense if we turn our attention to Nvidia’s plans. We just heard from AMD’s main GPU rival that its GeForce RTX 5070 is going to be on sale come March 5.

So, it looks to me very much like there’s going to be a head-to-head clash of the mid-range GPUs, more-or-less, just after March rolls around, with AMD aiming to take Nvidia on directly here.

Team Red may well be buoyed by the general shakiness of stock levels for Nvidia’s Blackwell GPUs so far, and (believable enough) rumors that the RTX 5070 may not be that much different from the RTX 5080 and 5090 in this respect. After all, Nvidia did announce that the RTX 5070 models would both launch in February – the Ti version, and vanilla spin – and has then pushed back the non-Ti graphics card to March. That broken promise doesn’t feel like a good sign, stock-wise, to me.

In contrast, AMD seems more confident about relatively robust levels of supply for RDNA 4, and indeed we know that these GPUs have been at retailers since January. That’s thanks to leaked photos from those shops, and moreover, Team Red’s own confirmation that board-making partners had “started building initial inventory at retailers” back in January.

On top of that, there are some compellingly positive rumors about the potential performance levels we’ll see from RX 9070 models to boot, and sources elsewhere indicate AMD really is taking its time over this next-gen GPU launch to get it right.

The only worry that remains is pricing, and whether AMD’s apparent confidence with this mid-range showdown against Nvidia’s RTX 5070 models might mean the company pushes a bit higher with asking prices for the RX 9070 variants.

If you scan through the replies to McAfee’s post on X, that’s the consistent thread of worry throughout from the respondents who have less positive thoughts on RDNA 4. In short, the fear is that Nvidia’s stumbling Blackwell launch might mean AMD decides to charge more for RX 9070 GPUs – although having set up its stall as these being mid-range graphics cards, there’s surely a limit to how far Team Red might be able to push here, if this was a temptation?

Time will tell, and I remain hopeful that AMD won’t drive to any excesses here – this is a great opportunity to take the fight to Nvidia, after all. At the same time, I’m not anticipating a surprise with lower pricing either, as given how the market stands right now, that doesn’t make a lot of sense. Still, whatever happens, we need to see exactly how RX 9070 performance pans out before we can really get a perspective on pricing, anyway.

Via VideoCardz

Source

Posted on

South Korea plots to become home to world’s largest AI datacentre

A newly created public-private partnership looks set to oversee the creation of the world’s largest artificial intelligence (AI) datacentre in South Korea.  

Work on the datacentre, which has a projected total cost of $35bn, is set to begin later this year and is expected to create a 3GW (gigawatt) datacentre by the time of its scheduled completion in 2028.

Overseeing the project will be investment company Stock Farm Road, which has signed a memorandum of understanding (MoU) with South Korean governor Kim Young-rok of the Jeollanam-do Province that will pave the way for the site’s development.

“The facility will feature advanced cooling infrastructure, regional and international fibre bandwidth, and the ability to handle significant and sudden variations in energy load,” according to a statement. “It will serve as a foundation for next-generation AI enablement, fostering innovation and economic growth in the region and beyond.”

The statement further claims the project will lead to the creation of 10,000 jobs in a variety of disciplines spanning energy supply and storage, renewable energy production, equipment supply, and research and development.

“This is more than just a technological milestone; it’s a strategic leap forward for Korea’s global technological leadership,” said Stock Farm Road co-founder Amin Badr-El-Din.

“We are incredibly proud to partner with Stock Farm Road and the Jeollanam-do government to build this crucial infrastructure, creating an unprecedented opportunity to build the foundation for next-generation AI.”

Stock Farm Road has a background of using data analytics and AI tools to manage energy resources, and operates its own proprietary energy-to-intelligence platform, known as e2i².

The company said its expertise in this area will come into play during the datacentre’s construction, while other parts of its business will provide access to capital to fund the build.

Meanwhile, the Jeollanam-do government side of the partnership will provide support by enabling the developers to secure the permits and approvals needed to allow construction of the datacentre to start.

Stock Farm Road co-founder Brian Koo said the project could have a transformational impact on the region.

“Having witnessed first-hand the immense technological capabilities of large Asian enterprises, I recognise the potential of this project to elevate Korea and the region to a new level of technological advancement and economic prosperity,” said Koo. “This datacentre is not merely an infrastructure project, but the launchpad for a new digital industrial revolution.”

Looking ahead, Stock Farm Road said in its statement that the South Korean project marks the delivery of the first phase of its broader global strategy, whereby the company will seek to establish similar AI infrastructure partnerships across Asia, Europe and the US over the next 18 months.

The decision to site the datacentre in the Jeollanam-do province of South Korea is notable, and in keeping with the direction the country’s government has been taking for some time in supporting the spread of datacentre developments outside the central Seoul area.

“The general policy direction is for the decentralisation of datacentres away from the greater Seoul area to regional areas for the establishment of purpose-led districts,” said John Pritchard, Korea datacentre advisory team lead at real estate consultancy Cushman & Wakefield, in a late 2024 research note.

“However, this provides challenges for users, whereby latency and proximity to [the] end user are key considerations, and as such datacentres operating in the metropolitan area will become crucial enabling tools for digital groups.”

Source

Posted on

AMD’s RX 9070 GPUs could go on sale March 6, the day after Nvidia’s RTX 5070 – and I wouldn’t fret about those 900W power supply rumors

  • AMD RX 9070 GPUs are rumored to hit shelves on March 6
  • Another rumor suggests 9070 XT could need a 900W power supply
  • That’s for a top-end overclocked version of the 9070 XT, though, and there are numerous caveats to consider here

AMD’s Radeon RX 9070 graphics cards will get a full launch event on February 28, which has been confirmed by Team Red, and now chatter on the rumor mill is indicating these GPUs will be available to buy on March 6.

That purported on-sale date comes courtesy of VideoCardz’s sources, an assertion also backed up by Chinese tech site Benchlife. Even though these two rumors align, we should still take this with a great deal of caution.

That said, AMD has told us that its RX 9070 models will go on sale in early March, which I take to mean the first week, and March 6 fits that picture. Still, we’ll need confirmation officially, and presumably that’ll come at the mentioned press event for RDNA 4 GPUs in late February.

At the same time, more speculation is floating around regarding the power consumption of the RX 9070 XT, suggesting that one third-party variant has a big ask in terms of your PC’s power supply.

Tom’s Hardware noticed a post on X from Tomasz Gawroński showing a purportedly leaked image of the PowerColor RX 9070 XT Red Devil, with the packaging apparently indicating that you’ll need a 900W PSU to have this graphics card in your gaming PC.

This has raised plenty of eyebrows, as it’s 100W more than the current recommendation for the RX 7900 XTX flagship, though even the poster admitted that they weren’t sure if the image is faked.

Interestingly, Frank Azor, who is head of consumer and gaming marketing at AMD, actually replied to Gawroński, observing that there will be other RX 9070 XT models that’ll “require lower minimum power supply wattages as will there be plenty with 8 pin power connectors for worry-free upgrading.”

Note that Azor didn’t confirm that the image was real, though the AMD executive didn’t call it a fake, either (but of course, he doesn’t work for PowerColor).

AMD RX 9070 GPU models

(Image credit: AMD / TechPowerup)

Analysis: Ready for the AMD vs Nvidia mid-range GPU shootout?

The launch date of March 6 for the RX 9070 models, if it turns out to be correct, is certainly an interesting choice – mainly because Nvidia only just announced March 5 is when the RTX 5070 arrives on shelves. So, as we theorized earlier this week, this is looking very much like a mid-range head-to-head between the RX 9070 and RTX 5070 in the first week of March.

As for the PSU requirement, I wouldn’t panic about the potential scenario of RX 9070 models somehow requiring vast reservoirs of power. Some of the beefiest models may, but we must remember, the Red Devil board mentioned in the leak is a top-end graphics card, and as Azor observed, other models will require less wattage. Indeed, the rumor is that the reference 9070 XT from AMD will ask for a 750W PSU, with the vanilla RX 9070 needing 650W, which are far more modest requirements (though take that with a pinch of salt, too).

It’s also worth noting that 900W is an odd specification here, given that there aren’t any PSUs delivering that exact figure, as far as I’m aware. There are 850W models and then we jump to 1000W, so why PowerColor is (theoretically) placing the requirement just above 850W, in a non-existent PSU bracket (as it were), I’m not sure. This could perhaps be another suggestion that the image is faked.

That said, I don’t doubt that a heavily overclocked RX 9070 XT model will drink a lot more juice than a standard board. It clearly will, and so it wouldn’t be a surprise if the top dogs of the RDNA 4 graphics card world are considerably more demanding on the PC’s power supply. These GPUs will also cost a lot more than the entry-level 9070 XT products, too, and how competitive AMD’s graphics cards will be in pricing terms is the other key question we’re dying to have answered.

We’ll have those answers soon enough, thankfully. Roll on the end of February.

Source

Posted on

EY: Industrial companies worldwide stunted in emerging technology use

Many companies across a range of industries worldwide are stuck at the trial stage in their use of emerging technologies, according to the sixth annual EY Reimagining industry futures study.

The firm surveyed 1,635 enterprises in November 2024, including 9% in the UK, 6% in Germany and 20% in the US. Respondents were drawn from a range of industries, including financial services (13%), cars and transport (13%), energy, mining and utilities (13%), and manufacturing (12%).

The study has a strong 5G and internet of things (IoT) orientation, as it has done in previous years. The lead authors come from the firm’s telecoms, media and technology (TMT) practices: Rob Atkinson, area managing partner for UK and Ireland TMT, and Adrian Baschnonga, global TMT lead analyst.

The press statement that goes with the report says nearly half (47%) of the respondents are investing in generative artificial intelligence (GenAI), compared with 43% last year.

Some 43% are investing in IoT, and 33% are investing in 5G technology, suggesting an upward trend from 39% and 27% respectively in 2024.

However, the report finds that businesses struggle to convert technology trials into live deployments. Only 1% of organisations have active deployments of GenAI. And while IoT investment seems to be rising year-on-year, the proportion of businesses with active IoT deployments is in decline, slipping to 16% this year compared with 19% in 2024.

Active deployments of edge computing are also flat year-on-year, at 22%.

CEOs get more into technology selection

The report finds decision-making inside enterprises is spreading out across the C-suite, with 49% of CEOs now involved in emerging technology strategy, including in choice of suppliers. Organisations where the CEO is a key decision-maker are further along, the report found.

Over half (51%) of businesses with CEOs involved in new technology decisions are investing in GenAI, compared with 44% of organisations where the CEO is less involved.

“As well as posing a challenge to unlocking long-term value, a failure to progress beyond the trial phase means businesses risk missing out on the combined impact of different technologies deployed together, an area where four in five (79%) organisations are looking to achieve more,” said Atkinson. “There could also be a danger that too many emerging technologies initiatives will be conducted in isolation, limiting the resulting business benefits.”

The report found that respondents have limited awareness of what IT suppliers have on offer.

Some 73% said they need a better understanding of the changing supplier landscape. EY comments that this reflects “an environment where collaborative ecosystems featuring alliances between different technology providers are becoming the norm”.

More than half (56%) the respondents believe they lack awareness of their technology suppliers’ partners. Less than a third of organisations have high awareness of new mobile technology capabilities such as network application programming interfaces (32%) and network slicing (26%).

“Organisations view ecosystem collaboration as a route to access new skills and capabilities but lack understanding of changing supplier ecosystems,” said Baschnonga. “With many companies under pressure to consolidate vendors, suppliers should prioritise their ecosystem and alliance strategies by concentrating on key partners and adapting their operating models and go-to-market approaches accordingly.”

The report found the ability to scale and integrate different technologies is important to one in four (25%) of those surveyed.

“The intention to focus spending on a smaller number of key suppliers makes it even more important that ICT providers present themselves as effective ecosystem orchestrators, able to provide end-to-end solutions with the assistance of partners and intermediaries,” said Atkinson. “As part of this, suppliers should take care to underline capabilities that extend beyond their core products.

“While enterprises remain committed to embracing leading-edge technologies like GenAI, IoT and 5G, they are facing challenges in translating their investments into real business value,” he said. “Now is the time for IoT suppliers to reposition themselves as holistic partners to their business customers and help them realise the full benefits of their spending on digital transformation.”

In the report itself, the authors say: “This year’s findings show that organisations across all sectors remain committed to investing in emerging technologies to transform their operations – but that issues around scalability and legacy integration are top of mind. Meanwhile, ICT vendors need to pay close attention to enterprises’ increasing focus on security and growing demand for ecosystem orchestration.”

Businesses dimly aware of datacentre environmental impact

They also pick out sustainability as an increasingly relevant theme for enterprise IT, especially with regard to datacentres. “Sustainability factors increasingly weigh on decisions about emerging technology investments, with organisations more sensitive than before to the potentially ambivalent role of new technologies in the decarbonisation agenda.”

Datacentres, the report’s authors comment, are an area of low environmental, social, and governance awareness for businesses. Half the organisations surveyed are unaware of their datacentres’ emissions profiles.

Respondents are looking at a range of GenAI use cases, with no standout preferences, the report found. Some 50% of businesses see cyber security and data protection as a leading GenAI impediment, while 46% said a need to improve data governance to combat risks concerning data accuracy and ethics would be critical to future implementations.

Data governance scores highest among manufacturers (46%) as a GenAI concern, while capturing productivity gains ranks top among EY’s respondents in the consumer (48%) and energy (47%) sectors.

Across all sectors, the most favoured GenAI use cases are software development, customer service and employee training or collaboration. However, financial services, healthcare and manufacturing respondents rated predictive or real-time operations and supply chain management as top-five GenAI use cases.

Upskilling and more collaboration

The report says the two most important changes that organisations can make are employee upskilling and deeper collaboration across business functions.

On a country level, education and employee upskilling is highly ranked by German respondents (36%), while deeper collaboration between business functions leads as an action among Chinese businesses (31%).

Elsewhere, Indian (20%) and Japanese (18%) businesses are most likely to prioritise collaboration with suppliers.

Source

Posted on

Galaxy S25 Edge specs leak teases an amazing ultra-thin phone

People in attendance at the Galaxy S25 Unpacked event last month might’ve been surprised by the product Samsung used to open the show: the new ultra-thin Galaxy S25 Edge. Then, during the hands-on experience, people crowded the Galaxy S25 Edge tables to take photos of the sleek new phone that Samsung teased briefly on stage.

Despite not being launched officially, the Galaxy S25 Edge was the big star of the show, and I’m not surprised. It can’t be just me who is interested in an ultra-thin flagship phone. I’m already looking forward to buying the iPhone 17 Air that Apple is expected to introduce this year. That’s the iPhone model that supposedly inspired Samsung to rush and unveil the Galaxy S25 Edge before Apple shows off its super-thin handset in September.

Samsung did not reveal Galaxy S25 Edge specs details at the show. While we speculated the handset would have high-end hardware like the rest of the Galaxy S25 series, we didn’t get actual confirmation at the event.

Now, someone posted on YouTube a very early hands-on video showing the Galaxy S25 Edge in action, complete with specs. The video was quickly pulled, but it confirmed leaks saying that the handset will rock high-end hardware despite being so thin.

According to SamMobile, which was among those quick enough to see the video before it was removed, the Galaxy S25 Edge is under 6mm thick. This seems to confirm rumors pointing to a profile of 5.84mm. Comparatively, the base Galaxy S25 model is 7.2mm thick.

The same YouTuber also provided the main hardware specs for the Galaxy S25 Edge units they tested. The phone features the Snapdragon 8 Elite chip, 12GB of RAM, 256GB of storage, and a 4,000 mAh battery.

The specs app the YouTuber used also provided purported camera details, suggesting the phone has three 12-megapixel cameras. However, the Galaxy S25 Edge only features two cameras on the back.

SamMobile explained that the app used to collect spec details usually gets camera hardware wrong, as it looks at the default resolution of photos. The Galaxy S25 Edge is rumored to feature a 200-megapixel main sensor rather than a 12-megapixel camera.

Camera specs aside, the rest of the hardware specs tell an impressive story. The Galaxy S25 Edge will apparently feature specs on par with the thicker Galaxy S25 phones.

All three Galaxy S25 models pack the same Snapdragon 8 Elite chip and at least 12GB of RAM. The list includes the Galaxy S25 Ultra, the best Galaxy S25 flavor you can buy.

The back of the Samsung Galaxy S25 Plus in blue. Image source: Christian de Looper for BGR

The base Galaxy S25 model, which we’ve called maybe the most minor update in history, starts at 128GB of storage. The Galaxy S25 Plus model above starts at 256GB of storage.

More interesting is the Galaxy S25 Edge battery claim. At 4,000 mAh, the battery is as big as the Galaxy S25’s battery pack. However, given the phone’s nimble profile, the Galaxy S25 Edge battery has to be thinner and taller.

It’s unclear what compromises Samsung might have made to reduce the Galaxy S25 Edge’s thickness. The Snapdragon 8 Elite chip might be underclocked compared to the regular version. The phone’s thickness might have impacted the cooling system and other internal components. 

The battery might match the Galaxy S25’s size, but the phone should have a larger screen, which will inevitably consume more energy.

Even if the specs aren’t confirmed for the camera, we know the handset will feature two lenses instead of three.

That said, I’m cautiously optimistic about the Galaxy S25 Edge. The phone will have high-end specs in an ultra-thin body.

Samsung’s confidence in its ability to pull off a high-end ultra-thin Galaxy S phone makes me even more excited about the iPhone 17 Air. Remember that Samsung would have never made an ultra-thin Galaxy S phone without Apple first making a slim iPhone.

With March approaching, we’re getting closer to the Galaxy S25 Edge launch event. Samsung only teased the handset in January, but the ultra-thin phone should be launched in the second quarter.

Source

Posted on

DeepSeek-R1: Budgeting challenges for on-premise deployments

Until now, IT leaders have needed to consider the cyber security risks posed by allowing users to access large language models (LLMs) like ChatGPT directly via the cloud. The alternative has been to use open source LLMs that can be hosted on-premise or accessed via a private cloud. 

The artificial intelligence (AI) model needs to run in-memory and, when using graphics processing units (GPUs) for AI acceleration, this means IT leaders need to consider the costs associated with purchasing banks of GPUs to build up enough memory to hold the entire model.

Nvidia’s high-end AI acceleration GPU, the H100, is configured with 80Gbytes of random-access memory (RAM), and its specification shows it’s rated at 350W in terms of power consumption.

China’s DeepSeek has been able to demonstrate that its R1 LLM can rival US artificial intelligence without the need to resort to the latest GPU hardware. It does, however, benefit from GPU-based AI acceleration.

Nevertheless, deploying a private version of DeepSeek still requires significant hardware investment. Running the entire DeepSeek-R1 model, which has 671 billion parameters, in-memory requires 768Gbytes of memory. With Nvidia H100 GPUs, which are configured with 80Gbytes of video memory each, 10 would be required to ensure the entire DeepSeek-R1 model can run in-memory.
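The GPU count follows from dividing the model's memory footprint by the memory on each card and rounding up. A minimal Python sketch of that arithmetic, using only the two figures quoted above; the function and variable names are illustrative, and the calculation ignores any extra memory needed for caches or runtime overhead:

```python
import math

MODEL_MEMORY_GB = 768  # memory the article says the full model needs
GPU_MEMORY_GB = 80     # memory per Nvidia H100, as quoted in the article


def gpus_needed(model_gb: float, per_gpu_gb: float) -> int:
    """Smallest whole number of GPUs whose combined memory holds the model."""
    return math.ceil(model_gb / per_gpu_gb)


count = gpus_needed(MODEL_MEMORY_GB, GPU_MEMORY_GB)
# Memory left over once the model is loaded (hypothetical headroom figure).
headroom_gb = count * GPU_MEMORY_GB - MODEL_MEMORY_GB

print(f"GPUs needed: {count}, spare memory: {headroom_gb} Gbytes")
```

Nine cards would provide only 720Gbytes, which is why the count rounds up to 10, leaving 32Gbytes spare.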

IT leaders may well be able to negotiate volume discounts, but the cost of just the AI acceleration hardware to run DeepSeek is around $250,000.

Less powerful GPUs can be used, which may help to reduce this figure. But given current GPU prices, a server capable of running the complete 671 billion-parameter DeepSeek-R1 model in-memory is going to cost over $100,000.

The server could be run on public cloud infrastructure. Azure, for instance, offers access to the Nvidia H100 with 900 GBytes of memory for $27.167 per hour, which, on paper, should easily be able to run the 671 billion-parameter DeepSeek-R1 model entirely in-memory.

If this model is used every working day, and assuming a 35-hour week and four weeks a year of holidays and downtime, the annual Azure bill would be almost $46,000. With a three-year commitment, the rate drops to $16.63 per hour, which over the same 1,680 hours works out at about $28,000 per year.

Less powerful GPUs will clearly cost less, but it’s the memory costs that make these prohibitive. For instance, looking at current Google Cloud pricing, the Nvidia T4 GPU is priced at $0.35 per GPU per hour, and is available with up to four GPUs, giving a total of 64Gbytes of memory for $1.40 per hour. Twelve such four-GPU configurations would be needed to fit the 671 billion-parameter DeepSeek-R1 model entirely in-memory, which works out at $16.80 per hour. With a three-year commitment, this figure comes down to $7.68 per hour, which works out at just under $13,000 per year.
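The annual cloud figures can be reproduced from the hourly rates and the usage pattern described above. The sketch below is illustrative only: the hourly rates are the ones quoted in the article, the 52-week year is an assumption, and the names are hypothetical. It is not a pricing tool, and real bills would include storage, networking and other charges.

```python
HOURS_PER_WEEK = 35     # working week assumed in the article
WORKING_WEEKS = 52 - 4  # assumes a 52-week year minus four weeks of downtime


def annual_cost(rate_per_hour: float) -> float:
    """Yearly bill for an instance that runs only during working hours."""
    return rate_per_hour * HOURS_PER_WEEK * WORKING_WEEKS


# Hourly rates as quoted in the article (US dollars).
hourly_rates = {
    "Azure H100, pay-as-you-go": 27.167,
    "Azure H100, three-year commitment": 16.63,
    "Google Cloud T4 (12 x 4 GPUs), on-demand": 12 * 4 * 0.35,
    "Google Cloud T4 (12 x 4 GPUs), three-year commitment": 7.68,
}

for name, rate in hourly_rates.items():
    print(f"{name}: ${rate:.2f}/hour, about ${annual_cost(rate):,.0f}/year")
```

On these assumptions the instance runs for 1,680 hours a year, so the pay-as-you-go Azure rate comes to roughly $45,600 and the committed T4 rate to roughly $12,900, in line with the rounded figures in the text.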

A cheaper approach

IT leaders can reduce costs further by avoiding expensive GPUs altogether and relying entirely on general-purpose central processing units (CPUs). This setup is really only suitable when DeepSeek-R1 is used purely for AI inference.

A recent tweet from Matthew Carrigan, machine learning engineer at Hugging Face, suggests such a system could be built using two AMD Epyc server processors and 768 Gbytes of fast memory. The system he presented in a series of tweets could be put together for about $6,000.

Responding to comments on the setup, Carrigan said he is able to achieve a processing rate of six to eight tokens per second, depending on the specific processor and memory speed that is installed. It also depends on the length of the natural language query, but his tweet includes a video showing near-real-time querying of DeepSeek-R1 on the hardware he built based on the dual AMD Epyc setup and 768Gbytes of memory.

Carrigan acknowledges that GPUs will win on speed, but they are expensive. In his series of tweets, he points out that the amount of memory installed has a direct impact on performance. This is due to the way DeepSeek “remembers” previous queries to get to answers quicker. The technique is called Key-Value (KV) caching.

“In testing with longer contexts, the KV cache is actually bigger than I realised,” he said, and suggested that the hardware configuration would require 1Tbyte of memory instead of 768Gbytes when huge volumes of text or context are pasted into the DeepSeek-R1 query prompt.

Buying a prebuilt Dell, HPE or Lenovo server to do something similar is likely to be considerably more expensive, depending on the processor and memory configurations specified.

A different way to address memory costs

Among the approaches that can be taken to reduce memory costs is using multiple tiers of memory controlled by a custom chip. This is what California startup SambaNova has done using its SN40L Reconfigurable Dataflow Unit (RDU) and a proprietary dataflow architecture for three-tier memory.

“DeepSeek-R1 is one of the most advanced frontier AI models available, but its full potential has been limited by the inefficiency of GPUs,” said Rodrigo Liang, CEO of SambaNova.

The company, which was founded in 2017 by a group of ex-Sun/Oracle engineers and has an ongoing collaboration with Stanford University’s electrical engineering department, claims the RDU chip collapses the hardware requirements to run DeepSeek-R1 efficiently from 40 racks down to one rack configured with 16 RDUs.

Earlier this month at the Leap 2025 conference in Riyadh, SambaNova signed a deal to introduce Saudi Arabia’s first sovereign LLM-as-a-service cloud platform. Saud AlSheraihi, vice-president of digital solutions at Saudi Telecom Company, said: “This collaboration with SambaNova marks a significant milestone in our journey to empower Saudi enterprises with sovereign AI capabilities. By offering a secure and scalable inferencing-as-a-service platform, we are enabling organisations to unlock the full potential of their data while maintaining complete control.”

This deal with the Saudi Arabian telco provider illustrates how governments need to consider all options when building out sovereign AI capacity. DeepSeek demonstrated that there are alternative approaches that can be just as effective as the tried and tested method of deploying immense and costly arrays of GPUs.

And while DeepSeek-R1 does indeed run better when GPU-accelerated AI hardware is present, what SambaNova is claiming is that there is also an alternative way to achieve the same performance for running models like DeepSeek-R1 on-premise, in-memory, without the costs of having to acquire GPUs fitted with the memory the model needs.

Source