Posted on

DeepSeek-R1: Budgeting challenges for on-premise deployments

Until now, IT leaders have needed to consider the cyber security risks posed by allowing users to access large language models (LLMs) like ChatGPT directly via the cloud. The alternative has been to use open source LLMs that can be hosted on-premise or accessed via a private cloud. 

The artificial intelligence (AI) model needs to run in-memory and, when using graphics processing units (GPUs) for AI acceleration, this means IT leaders need to consider the costs associated with purchasing banks of GPUs to build up enough memory to hold the entire model.

Nvidia’s high-end AI acceleration GPU, the H100, is configured with 80Gbytes of random-access memory (RAM), and its specification shows it’s rated at 350W in terms of energy use.

China’s DeepSeek has been able to demonstrate that its R1 LLM can rival US artificial intelligence without the need to resort to the latest GPU hardware. It does, however, benefit from GPU-based AI acceleration.

Nevertheless, deploying a private version of DeepSeek still requires significant hardware investment. Running the entire DeepSeek-R1 model, which has 671 billion parameters, in-memory requires 768Gbytes of memory. With Nvidia H100 GPUs, which are configured with 80Gbytes of video memory each, 10 would be required to ensure the entire DeepSeek-R1 model can run in-memory.
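The GPU count follows from dividing the memory requirement by the memory per card and rounding up. A minimal sketch of that sum, using only the figures quoted above (the function name is illustrative, and the calculation ignores any extra memory needed for long prompts):

```python
import math

MODEL_MEMORY_GB = 768  # memory the full DeepSeek-R1 model needs, per the article
H100_MEMORY_GB = 80    # memory on each Nvidia H100, per the article


def gpus_needed(model_gb: float, gpu_gb: float) -> int:
    """Smallest whole number of GPUs whose combined memory holds the model."""
    return math.ceil(model_gb / gpu_gb)


print(gpus_needed(MODEL_MEMORY_GB, H100_MEMORY_GB))  # 10
```

Nine cards would offer only 720Gbytes, which is why the figure rounds up to 10.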

IT leaders may well be able to negotiate volume discounts, but the cost of just the AI acceleration hardware to run DeepSeek is around $250,000.

Less powerful GPUs can be used, which may help to reduce this figure. But given current GPU prices, a server capable of running the complete 671 billion-parameter DeepSeek-R1 model in-memory is going to cost over $100,000.

The server could be run on public cloud infrastructure. Azure, for instance, offers access to the Nvidia H100 with 900 GBytes of memory for $27.167 per hour, which, on paper, should easily be able to run the 671 billion-parameter DeepSeek-R1 model entirely in-memory.

If this model is used every working day, and assuming a 35-hour week and four weeks a year of holidays and downtime, the annual Azure bill would be almost $46,000. This figure could be reduced significantly, to $16.63 per hour ($23,000 per year), if there is a three-year commitment.
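The annual figure can be reproduced from the hourly rate and the usage pattern described. The sketch below assumes a 52-week year, which the article does not state explicitly, and the function and constant names are illustrative:

```python
HOURS_PER_WEEK = 35
WORKING_WEEKS = 52 - 4  # assumes a 52-week year minus four weeks of holidays and downtime


def annual_cost(hourly_rate: float,
                hours_per_week: int = HOURS_PER_WEEK,
                weeks: int = WORKING_WEEKS) -> float:
    """Yearly bill in dollars for a machine billed only while it is in use."""
    return round(hourly_rate * hours_per_week * weeks, 2)


print(f"Pay as you go:      ${annual_cost(27.167):,.2f}")
print(f"Three-year commit:  ${annual_cost(16.63):,.2f}")
```

Under these assumptions, the pay-as-you-go rate comes to about $45,600 for 1,680 hours of use, in line with the "almost $46,000" quoted. The $16.63 rate comes to roughly $27,900 over the same hours, so the lower committed figure quoted in the text may rest on different usage assumptions.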

Less powerful GPUs will clearly cost less, but it’s the memory costs that make these prohibitive. For instance, looking at current Google Cloud pricing, the Nvidia T4 GPU is priced at $0.35 per GPU per hour, and is available with up to four GPUs, giving a total of 64Gbytes of memory for $1.40 per hour. Twelve of these four-GPU configurations would be needed to fit the DeepSeek-R1 671 billion-parameter model entirely in-memory, which works out at $16.80 per hour. With a three-year commitment, this figure comes down to $7.68 per hour, which works out at just under $13,000 per year.
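The T4 numbers can be checked the same way. This is a rough sketch using the prices quoted above and the same assumed usage of 35 hours a week for 48 weeks a year; the names are illustrative:

```python
import math

MODEL_MEMORY_GB = 768           # memory needed for the full model, per the article
INSTANCE_MEMORY_GB = 64         # one four-GPU T4 configuration
INSTANCE_HOURLY_USD = 1.40      # on-demand price for that configuration
COMMITTED_HOURLY_USD = 7.68     # quoted three-year-commitment price for the whole setup
HOURS_PER_YEAR = 35 * (52 - 4)  # assumes a 52-week year minus four weeks off


def instances_needed(model_gb: float, instance_gb: float) -> int:
    """Smallest whole number of configurations whose memory holds the model."""
    return math.ceil(model_gb / instance_gb)


instances = instances_needed(MODEL_MEMORY_GB, INSTANCE_MEMORY_GB)
on_demand_hourly = instances * INSTANCE_HOURLY_USD
committed_annual = COMMITTED_HOURLY_USD * HOURS_PER_YEAR

print(f"{instances} configurations, ${on_demand_hourly:.2f} per hour on demand")
print(f"${committed_annual:,.2f} per year with a three-year commitment")
```

Twelve configurations at $1.40 each give the $16.80 hourly figure, and $7.68 over 1,680 hours is a little over $12,900, matching the "just under $13,000" in the text.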

A cheaper approach

IT leaders can reduce costs further by avoiding expensive GPUs altogether and relying entirely on general-purpose central processing units (CPUs). This setup is really only suitable when DeepSeek-R1 is used purely for AI inference.

A recent tweet from Matthew Carrigan, machine learning engineer at Hugging Face, suggests such a system could be built using two AMD Epyc server processors and 768 Gbytes of fast memory. The system he presented in a series of tweets could be put together for about $6,000.

Responding to comments on the setup, Carrigan said he is able to achieve a processing rate of six to eight tokens per second, depending on the specific processor and memory speed that is installed. It also depends on the length of the natural language query, but his tweet includes a video showing near-real-time querying of DeepSeek-R1 on the hardware he built based on the dual AMD Epyc setup and 768Gbytes of memory.

Carrigan acknowledges that GPUs will win on speed, but they are expensive. In his series of tweets, he points out that the amount of memory installed has a direct impact on performance. This is due to the way DeepSeek “remembers” the text it has already processed in a conversation, so it does not have to work through it again to get to answers. The technique is called Key-Value (KV) caching.

“In testing with longer contexts, the KV cache is actually bigger than I realised,” he said, and suggested that the hardware configuration would require 1Tbyte of memory instead of 768Gbytes when huge volumes of text or context are pasted into the DeepSeek-R1 query prompt.
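To see why longer contexts eat into memory, it helps to look at how a KV cache scales. The sketch below uses the textbook sizing for a standard attention layer, where every token stores one key vector and one value vector per head in every layer. The model shape is hypothetical and chosen only to show the trend; DeepSeek-R1's own attention design and real numbers may differ.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2) -> int:
    """Approximate KV cache size for a standard attention stack.

    The leading factor of 2 accounts for storing both a key and a value
    vector for every token, in every head of every layer.
    """
    return 2 * num_layers * num_kv_heads * head_dim * context_tokens * bytes_per_value


# Hypothetical model shape, not DeepSeek-R1's: 60 layers, 8 KV heads, 128-wide heads.
for tokens in (1_000, 10_000, 100_000):
    gib = kv_cache_bytes(60, 8, 128, tokens) / 2**30
    print(f"{tokens:>7} tokens -> {gib:.2f} GiB")
```

The cache grows in direct proportion to the number of tokens in the context, so pasting in ten times as much text needs ten times as much cache on top of the memory already holding the model.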

Buying a prebuilt Dell, HPE or Lenovo server to do something similar is likely to be considerably more expensive, depending on the processor and memory configurations specified.

A different way to address memory costs

Among the approaches that can be taken to reduce memory costs is using multiple tiers of memory controlled by a custom chip. This is what California startup SambaNova has done using its SN40L Reconfigurable Dataflow Unit (RDU) and a proprietary dataflow architecture for three-tier memory.

“DeepSeek-R1 is one of the most advanced frontier AI models available, but its full potential has been limited by the inefficiency of GPUs,” said Rodrigo Liang, CEO of SambaNova.

The company, which was founded in 2017 by a group of ex-Sun/Oracle engineers and has an ongoing collaboration with Stanford University’s electrical engineering department, claims the RDU chip collapses the hardware requirements to run DeepSeek-R1 efficiently from 40 racks down to one rack configured with 16 RDUs.

Earlier this month at the Leap 2025 conference in Riyadh, SambaNova signed a deal to introduce Saudi Arabia’s first sovereign LLM-as-a-service cloud platform. Saud AlSheraihi, vice-president of digital solutions at Saudi Telecom Company, said: “This collaboration with SambaNova marks a significant milestone in our journey to empower Saudi enterprises with sovereign AI capabilities. By offering a secure and scalable inferencing-as-a-service platform, we are enabling organisations to unlock the full potential of their data while maintaining complete control.”

This deal with the Saudi Arabian telco provider illustrates how governments need to consider all options when building out sovereign AI capacity. DeepSeek demonstrated that there are alternative approaches that can be just as effective as the tried and tested method of deploying immense and costly arrays of GPUs.

And while the model does indeed run better when GPU-accelerated AI hardware is present, what SambaNova is claiming is that there is also an alternative way to achieve the same performance for running models like DeepSeek-R1 on-premise, in-memory, without the costs of having to acquire GPUs fitted with the memory the model needs.

Source


Balancing act: Managing business needs alongside digital transformation and innovation

When building a startup, there is a real balancing act between managing expectations, educating on what’s possible, and identifying the true cost of innovation. CTOs are challenged not only to build functional technology platforms quickly, but to do so as cost effectively as possible.

Startups are often not profitable and therefore don’t have a lot of cash to burn, meaning the CTO has to deliver technology solutions that meet the company’s business goals on a limited budget.

Let’s look at a legacy industry like commercial insurance – it’s been undergoing a transformation in recent years. The industry is data- and human-heavy and is heavily regulated, which is why it’s ripe for innovation. It is also playing catch-up to address the needs of many consumers who want a seamless user experience and businesses that want a modern experience – faster, streamlined, digitised, and so on – when dealing with insurance providers. This is particularly true of the on-demand economy.

Leveraging technology

The on-demand economy is characterised by the likes of Taskrabbit, Doordash, Uber, Deliveroo and Amazon Flex. But it’s the hard-working on-demand taxi and delivery drivers who are calling for flexible insurance that caters to their very specific needs, enabling them to buy comprehensive coverage for when they’re driving, and to switch it off when they’re not.

However, many insurtechs have not adequately met these needs despite their ability to leverage technology more nimbly and effectively than traditional players. The business of insurance is complicated and innovation cannot be retrofitted with existing tech, which is why it’s vital to have a deep understanding of what the requirements are between the customer, the insurance partners and platforms like Uber and Amazon, for instance.

Transforming the on-demand insurance industry is a symbiotic relationship between the customer, the insurance provider and the platform. Although it can deliver real results for all, it also comes with its share of unique challenges.

Loss ratio – how much an insurance company spends on claims compared to the premiums it receives – is a key indicator of profitability. When insurtech startups focus too much on showy AI-driven gimmicks such as automatic claims payments within seconds, loss ratios suffer – and crucial insurance industry partners back away quickly. In the world of insurance, “innovation at all costs” simply doesn’t work.
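As a quick illustration of the metric, loss ratio is simply claims divided by premiums. The figures below are hypothetical and the function name is illustrative:

```python
def loss_ratio(claims_paid: float, premiums_earned: float) -> float:
    """Share of premium income that goes back out as claims."""
    if premiums_earned <= 0:
        raise ValueError("premiums_earned must be positive")
    return claims_paid / premiums_earned


# Hypothetical book of business: 650,000 paid in claims on 1,000,000 of premiums.
print(f"{loss_ratio(650_000, 1_000_000):.0%}")  # 65%
```

A ratio that climbs towards or beyond 100% means claims are consuming all of the premium income, which is the outcome insurance partners are wary of.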

But technology cannot simply operate as a cost centre. By working in partnership with the rest of the business, startup CTOs and their teams need to focus on building an ongoing technology foundation to drive innovation within legacy industry structures and processes, driving business growth as well as consistent results for customers and partners.

Tech as augmentor – not replacement

Many of the challenges CTOs face aren’t necessarily about technology, but the change of mindset required when implementing tech solutions. Until very recently, insurance was an industry dominated by traditional players, governed by outdated systems and processes. While this is changing, there are still areas where bridges must be built between the promise of what technology can deliver and a certain “this is how it’s always been done” mindset.

For example, we know that insurance, like many industries, is ripe for reinvention through smart uses of AI – as long as it is implemented in the most appropriate areas of the business, and used as an augmented assistant rather than a replacement for specialist expertise.

At Inshur, working in combination with a team from Google Cloud, we were able to build an AI assistant for our claims team and demonstrate to management its effectiveness in helping the team prioritise work as well as speeding up administrative tasks, while providing fast and effective customer service. We’re continuing to roll out this technology internationally, as well as add further features to augment the human adjusters and utilise their expertise while saving them time.

The assistant helps the team to quickly scan incoming documents, including email, physical letters, attachments or transcribed phone calls; infer the data, including who is the sender and the intention of the communication; identify important and useful information such as vehicle registration and claimant name; identify the priority and urgency of the claim; assign it to the right team; and summarise the data into a standard format for ease of use. By automatically accepting feedback, retraining, and learning from past actions, the assistant also helps guide handlers with proposed next steps, helping to train new claims handlers.

The AI-based tools we built to support our claims teams have enabled us to see patterns that are also a good fit for other departments within the business. So much so, that we see potential for the commoditisation of these approaches to a wider set of solutions that serves not just insurance, but any business.

Build or buy?

Another question a lot of startup CTOs are asked is whether to build or buy. Building tech solutions from scratch can carry significant risk, especially given the resource investment typically required. But when every business in a given market is using the same platforms – usually with significant tweaks and workarounds to fit their specific needs – then nobody can truly win the innovation race.

First-movers must always be willing to build when necessary, and to buy when prudent.

For example, we decided that we needed to invest in developing our own solutions to problems that could not be adequately solved by off-the-shelf products. One such product is our Pay-as-you-flex wallet for Amazon Flex. While traditional insurance has historically covered drivers at all times, including when they’re not driving, we knew that technology held the key to delivering a new insurance product that would enable delivery drivers to pay only for the cover they needed, when they needed it.

As the first-of-its-kind to enter the market, we knew that we’d need to build it from scratch.

It’s only since we built our proprietary platform to manage business-critical processes including policy administration, claims management and billing that similar products have entered the market. By building a platform that’s fully tailored to the specific needs of the market we serve, we’ve paved the way for other insurers to do the same for their customers and partners.

However, the startup CTO must also take the lead in conversations where buying makes most sense, securing buy-in from other senior stakeholders and identifying the most appropriate vendors to partner with. Often, particularly in a high-growth startup where cost and return on investment are key considerations, this will involve a detailed assessment of risk for all available scenarios.

In Inshur’s case, we’re working with Google Cloud to implement several of its AI products to drive efficiencies and ensure that customers are treated fairly – which is both a regulatory and moral imperative in the insurance industry.

We know that our customers drive for a living, which means they often need to call us via their hands-free mobile technology while driving in between journeys, rather than emailing or speaking to a text-based chatbot. 

When we identified that a significant proportion of the calls coming into our customer service team could be quickly and effectively answered by an AI-driven solution, we implemented a “smart virtual agent” to handle more straightforward queries, enabling the team to focus more on serving customers with specific or detailed questions.

Bridging the gap

Because of the crucial role technology such as AI will play in the coming years, CTOs will need to ensure they are consistently developing deep understanding and expertise, not just in the latest technology innovations but also how they can be implemented to drive business strategy and growth.

Crucially, this will include taking a leadership role in helping to educate stakeholders across the business on the best use cases for AI tools and other solutions, building understanding at every level around what the technology can and can’t help with, and putting clear structure and process around innovation.

This ability to bridge the gap between the business and technology is already becoming a crucial indicator of future success.

Chris Gray is chief technology officer at vehicle insurance provider Inshur.


Gemini AI just got a new feature ChatGPT can’t match yet

The smarter AI programs like ChatGPT and Gemini become, the more we’ll want to use them as the virtual assistants they can be. For that to happen, we’ll need the AIs to access information about us from all sorts of apps and remember details about us. We’ll also need to be able to trust companies like OpenAI and Google with increasingly more personal data.

OpenAI was the first to bring memory features to ChatGPT. It happened with Custom Instructions, a feature I’ve used since it became available. About a year ago, OpenAI also added a Memory feature to ChatGPT that allowed it to remember things about users from chats beyond the scope of Custom Instructions. All of this happens with the user’s knowledge, and memories can be deleted at any time. Also, they don’t train the AI if you set your ChatGPT privacy preferences correctly.

Gemini needed more time to get memory features similar to ChatGPT. Google rolled out the first memory features in November, but they’re available only to Gemini Advanced subscribers. ChatGPT Memory features are also available to paying ChatGPT users.

However, Google has now improved Gemini’s memory in a way that OpenAI hasn’t. You can tell Gemini to recall information from your previous chats with the AI on a similar topic, which can be handy for picking up a conversation on the same subject.

“Starting today, Gemini can now recall your past chats to provide more helpful responses,” Google said in a blog on Thursday. “Whether you’re asking a question about something you’ve already discussed, or asking Gemini to summarize a previous conversation, Gemini now uses information from relevant chats to craft a response.”

While I have Custom Instructions enabled in ChatGPT and update them from time to time, I’m not using the memory feature. I don’t fully trust the AI to remember information about me, not that I provide information that might be too personal to hand over to the AI to begin with.

However, Google’s upgrade for Gemini is something I’d want from ChatGPT. The ability for ChatGPT to recall some conversations on a similar topic would certainly come in handy, as it would prevent me from having identical chats. That can happen from time to time.

I will remind you that ChatGPT Search did give ChatGPT a major UI overhaul, allowing users to search for previous chats. This makes it somewhat easier to recall past conversations, but I have to do it manually. Also, ChatGPT supports folders, so I can combine similar chats in the same folder to streamline my interactions with the AI.

Google’s way is better. I’d want to tell the AI to look at past conversations and find relevant information. This isn’t necessarily the same thing as the memory feature. It’s just giving the AI access to my chat data already stored in my account with a twist. I’d be able to manage what data the AI sees.

Google says that’s the case with Gemini:

You’re in control over what information is stored. You can easily review, delete or decide how long to keep your chat history. You can also turn off Gemini Apps Activity altogether by going to My Activity. Gemini may indicate when it uses your past chats in sources and related content.

The new memory feature is rolling out in English, and you’ll need a Gemini Advanced subscription via the Google One AI Premium Plan. This subscription also gives you access to Google One cloud storage, which makes it a better deal than ChatGPT Plus.

Google Workspace Business and Enterprise subscribers will also get the feature in the coming weeks.


AWS and Microsoft could face ‘targeted intervention’ from CMA over UK cloud competition concerns

The competition watchdog has published the provisional findings from its long-running investigation into the inner workings of the UK cloud infrastructure services market, which show that competition in the sector is not working as well as it could be. For this reason, Kip Meek, chair of the CMA’s independent inquiry group, said it is advising the regulator to “consider investigating the largest cloud service providers using its new digital markets powers”.

This is because its findings suggest end-user organisations could be paying more than they need for cloud services, and are possibly at risk of being locked into using platforms that do not meet their “evolving” needs.

In a seven-page report, detailing the provisional findings of its investigation, the CMA said the lack of competition in the cloud market could mean UK customers are collectively paying hundreds of millions more per year than they need to for services.

It went on to state that UK cloud users can be locked into their “initial choice of provider” due to technical and commercial barriers that prevent customers from seeking out the services of other cloud suppliers who might have a better-priced or more innovative portfolio of services.

“We have provisionally found that AWS and Microsoft have been generating sustained returns from their cloud services substantially above their cost of capital in cloud services for a number of years,” the report said. “Customers say that cloud services offer both quality and innovation to them. However, we consider that a more competitive market would have sustained better market outcomes, including more consistently competitive prices, as well as further improvements in quality and innovation.”

Controversial licensing practices

The report also called out Microsoft’s controversial licensing practices, which typically see it charging customers more for running its software in its competitors’ cloud, as impacting on the competitive position of AWS and also Google by “partially foreclosing” them from the market.

As well as being in-scope of the CMA probe, Microsoft’s behaviour on this front is also the subject of a European Commission complaint, filed by Google in September 2024.

“[The licensing piece] exacerbates the harm we have provisionally found arising from high market concentration and barriers to entry and expansion in relation to Microsoft’s significant unilateral market power,” the report added.

To remedy the situation, the report suggests the CMA board should use powers conferred on it through the roll-out of the Digital Markets, Competition and Consumers Act 2024 (DMCCA) on 1 January 2025 to mark AWS and Microsoft out as suppliers with “strategic market status”.

This would mean the CMA could impose legally binding conduct requirements or pro-competition interventions on both firms to limit and remedy the toll their activities have allegedly had on the market.

As detailed in the report, such powers are “specifically designed to be effective in digital markets … that share a combination of characteristics that can cause them to ‘tip’ in favour of one or a few firms” by allowing the CMA to take a “targeted and iterative” approach to tackling the behaviour of such providers.

“We consider that measures aimed at AWS and Microsoft would address market-wide concerns by directly benefiting the majority of UK customers and producing wider, indirect effects by altering the competitive conditions for other providers,” the report stated.

Before any action can be taken by the CMA, a consultation on the provisional findings of its investigation needs to take place, with cloud market stakeholders now invited to share their feedback on the conclusions raised so far. The final report from the CMA’s investigation is due to drop by 4 August 2025.

In the meantime, AWS has responded to the CMA’s provisional findings by describing its proposed intervention under the terms of the DMCCA as “not warranted”, and urged it to think about the long-term impact of such a move.

“We urge the CMA to carefully consider how regulatory intervention in other areas will stifle innovation and ultimately harm customers in the UK,” a spokesperson for AWS said. “We will continue to work constructively with the CMA as they work on their final report.”

Rima Alaily, corporate vice-president and deputy general counsel in the competition law group at Microsoft, seemed to suggest in a statement to Computer Weekly that the contents of the CMA report are mistargeted. 

“The draft report should be focused on paving the way for the UK’s AI-powered future, not fixating on legacy products launched in the last century,” she said. “The cloud computing market has never been so dynamic and competitive, attracting billions in investments, new entrants and rapid innovation. What could be better for UK businesses and government?”

Meanwhile, Chris Lindsay, vice-president of customer engineering for Europe, the Middle East and Africa at Google Cloud, said the company was pleased to see the impact that restrictive licensing practices have on cloud customers feature in the CMA’s provisional findings.  

“Restrictive licensing harms UK cloud customers, threatens economic growth and stifles innovation, and we are encouraged that the CMA has recognised the harm of these practices,” he said.


Top 10 AI and storage stories of 2024

Artificial intelligence (AI) has hit the headlines and the datacentres, but with it comes a range of performance and operating considerations that impact storage as much as any other IT discipline.

In this review, we look at the key demands of AI processing on data storage, the type of storage AI requires, and the suitability of cloud storage for AI workloads.

We drill down into the data needs of AI and storage, such as the demands of high-dimension vector data and checkpointing during AI training, plus the compliance considerations that use of AI brings with it.

We also look at the responses of storage suppliers to the rapid rise of AI use cases in the datacentre, in terms of link-ups with leading players like Nvidia, as well as in their storage offer aimed at AI workloads. 

In this guide, we examine the data storage needs of artificial intelligence, the demands it places on data storage, the suitability of cloud and object storage for AI, and key AI storage products.

We look at the use of vector data in AI and how vector databases work, plus vector embedding, the challenges for storage of vector data and the key suppliers of vector database products.

We talk to Charlie Boyle of Nvidia about data challenges in artificial intelligence, key practical tips for AI projects, and demands on storage of training, inferencing, RAG and checkpointing.

Storage supplier announcements at Nvidia conference centre on infrastructure integration, tackling the GPU I/O bottleneck and AI hallucinations by running Nvidia NeMo and NIM microservices.

We spoke to Pure Storage CEO Charlie Giancarlo about why write speed is key for artificial intelligence workloads, accessible storage for AI data, and his prediction of the death of spinning disk.

We talk to NetApp’s Grant Caley about AI and data storage, the need for scale, performance and hybrid cloud, and to move, copy and clone data for wrangling for inference runs.

AI checkpointing operations targeted by Vast Data as it touts QLC-based storage for AI workloads.

Start looking at artificial intelligence compliance. That’s the advice of Mathieu Gorge of Vigitrust, who says AI governance is still immature, but firms should recognise the limits and still act.

AI consultancy Crater Labs spent vast amounts of time managing server-attached drives to ensure GPUs were saturated. A shift to all-flash Pure Storage slashed that to almost zero.

Originally driven by Intel’s now-defunct Optane storage class memory, Parallelstore offers massive parallel file storage targeted at artificial intelligence training use cases on Google Cloud.


Are you on the naughty or nice list for responsible AI adoption?

Over the past year, artificial intelligence (AI) has proved its worth as a long-term investment for businesses. It brings a range of perfectly wrapped presents to the table, making a significant impact on productivity, efficiency, and automation across business functions. With almost 40% of companies worldwide already using AI in some form, it’s undeniable that it has the capability to revolutionise business operations.

For example, Santa’s workshop would benefit from AI adoption in automation of its supply chain orders, faster and more accurate analysis of wish list data, and tracking of items that have made it into his sleigh.

To ensure he makes the most of AI’s benefits, Santa will have brought it on board with ethical guidelines and responsible practices in mind. But have you? Whether you’ve already adopted and want to make sure you’re using AI responsibly, or you’re yet to adopt and are looking to integrate ethical standards into your plan – time’s running out to get onto Santa’s nice list before Christmas.

Getting into the good books with responsible adoption

Adopting AI responsibly isn’t just about avoiding risks, it’s also a way of setting the stage for sustainable growth, efficiency, and innovation. If you jump on the AI bandwagon without building a solid foundation and outlining a clear strategy, a myriad of risks can await your business. Data breaches, ethical challenges, and financial losses are all risks businesses face if they ignore the importance of responsible adoption.

The most effective way of adopting AI to mitigate these risks is a responsible one, and it’s not as easy as plugging in your Christmas lights. Smart and strategic choices are the key to protecting business data and aligning AI initiatives with business goals.

Santa’s top tips for adopting responsibly

Like writing a Christmas shopping list, AI adoption can be too daunting to start for lots of businesses. With so much information out there, where are you meant to start?

The key is pushing fear to the side and making any type of start, even if it’s small. Those who start now and invest in AI will stay ahead of the curve. But like Rudolph and his crew, the AI gap is real, and businesses who don’t get on board now will be left behind. So, what do you need to consider to adopt AI responsibly?

  • Make sure your data shines like a bauble

Squeaky clean data is crucial to getting reliable insights from AI. Getting AI ready means prepping business operations for AI systems to easily slot in, so business data needs to be accurate, void of bias, and ready for action.

The same way you wouldn’t send Santa a disorganised wish list, you wouldn’t give AI messy data. Making sure data is up to date, without errors or duplicates, is critical to ensuring your AI delivers real value. This comes hand-in-hand with assessing your internal resources, and making sure your infrastructure can handle the scale and power of AI demands. More flexible cloud platforms like AWS, Google Cloud, and Azure can help businesses scale AI cost-effectively.

  • Embrace elf-level organisation

Training is a key part of onboarding AI. Do you think Santa’s elves are expected to wrap presents without being trained first? Preparation for AI use is essential to allowing your employees to understand its benefits and use it effectively.

As it affects every team in the business, not just the IT department, the entire workforce needs to be prepped for AI adoption. Whilst this can seem like a costly task, investing in your people is how AI will create valuable results. Change management is a key component to preparing workforces for the changes you need to adopt AI. Fostering a culture of readiness and continuous compliance is key to ensuring it becomes an asset.

Knowing your business objectives and making sure your AI strategy aligns with and contributes to them is key to maximising its capabilities. Whether improving customer experiences, automating repetitive tasks, or personalising services is your business goal, use AI to drive that strategy.

Prioritising AI applications that solve real problems as well as boost productivity is key to business growth. Do you need help with recommending products to your customers to increase sales? This is a tangible problem AI can solve for you. Like following a gingerbread recipe, baking a strategic AI plan will produce the best goods.

Santa’s secret weapon – Responsible AI

Long-term success is the outcome of adopting AI through responsible practices and with ethical guidelines in mind. High-quality data, aligned business goals, and a prepped workforce are the keys to thriving rather than falling behind.

If Santa’s already on board, why aren’t you? After all, it’s how he gets his presents from the North Pole to under your tree.

Get onto the nice list this Christmas – start small, think big, and stay responsible.

Kyle Hill is chief technology officer at ANS, a digital transformation provider and Microsoft’s UK Services Partner of the Year 2024. Headquartered in Manchester, it offers public and private cloud, security, business applications, low code, and data services to thousands of customers, from enterprise to SMB and public sector organisations.

Schwarz Group partners with Google on EU sovereign cloud

Google has partnered with retail giant Schwarz Group to deliver what the pair claim is truly secure and sovereign cloud-based collaboration for German and European regulated industries.

Through the partnership, Schwarz Group’s StackIT, the cloud provider for the retailer, which operates as an independent company offering sovereign cloud capabilities, will provide client-side encryption of customers’ Google Workspace data.

StackIT said customers’ data will remain resident within the European Union (EU), with full redundancy offered by backups hosted solely in its European datacentres to meet customer demands around data protection, data residency and data resiliency.

“Germany and the EU have until now lacked enterprise-grade cloud collaboration solutions that fully address the sovereignty requirements of regulated industries, including ensuring all data is secured and backed up on local soil with absolutely no opportunity for access by foreign nations or platform providers,” said Rolf Schumann, co-CEO of Schwarz Digits, the IT and digital division of the Schwarz Group.

“Our partnership and new offering with Google Cloud will fill this gap with an entirely new business model.”

Client-side encryption means Google has no access to customers’ data. According to Schwarz and Google, this safeguards the sovereignty of not only Schwarz Group, but also all customers who value the independence of their operations, giving them full confidence that their data is always in their control.

“This new partnership will enable the companies of Schwarz Group to combine its leadership in digital transformation with Google Cloud’s strengths in productivity, collaboration and security, enabled by our cutting-edge AI,” said Sundar Pichai, CEO of Google and Alphabet. “Together, we are opening up a world of new, sovereign opportunities for European organisations to innovate and build on our joint solutions, accelerating a new era of innovation.”

Through the partnership, Google Cloud’s security will be integrated with that of XM Cyber, Schwarz Digits’ hybrid cloud security company. This integrated offering will then be distributed to customers via the Google Cloud Marketplace.

According to Google and Schwarz, this integrated security will help German and European organisations, particularly those in highly regulated industries, raise the bar on their enterprise and multi-cloud security. In addition, XM Cyber’s Continuous Exposure Management will be embedded into the sovereign Google Workspace office productivity suite offered to European enterprises.

“This partnership changes the game for regulated industry players in Europe by removing the sovereignty and security concerns that often hold back more ambitious adoption of the cloud for productivity and collaboration,” said Thomas Kurian, CEO of Google Cloud. “Our alliance with companies of Schwarz Group will enable entire industries in Europe to deliver digital innovation with security and compliance at its core.”

Schwarz Group is Europe’s largest retailer, and the fourth-largest in the world. The company plans to transition its global office workforce to Google Workspace. The partnership with Google, according to Schwarz Group, enables critical workplace data to be protected against third-party access including foreign government institutions, and also transferred to alternate service providers if needed.

“Switching to Google Workspace is an important step for us out of legacy and into innovative, efficient and future-proof cloud-based collaboration,” said Christian Müller, Co-CEO of Schwarz Digits. “Google Workspace is the most secure and reliable productivity platform in the industry today, and we expect our organisation-wide migration to have significant flow-on benefits to all areas of operations from simplifying IT management to rendering our point-of-sale workflows significantly more efficient.”

Storage technology explained: Flash vs HDD

The past 12 months saw flash storage nudge into areas from which it had hitherto been absent. In particular, this was because denser, and therefore cheaper per gigabyte (GB), quad-level cell (QLC) flash storage became available in array markets and for use cases that were once considered nearline.

Alongside this, we saw the price-per-GB of flash drop towards the level of spinning disk hard disk drives (HDDs) then rebound rapidly as memory manufacturers chased profitability. Meanwhile, the keenest of flash storage advocates predicted the demise of the hard drive and the imminent victory of the all-flash datacentre.

In this article, we define enterprise flash storage, look into its QLC and triple-level cell (TLC) variants, the benefits of non-volatile memory express (NVMe) flash, and examine the pros and cons of flash versus HDD in terms of cost, performance, flash in the cloud, and the likelihood (or otherwise) of the all-flash datacentre.

What is enterprise flash storage?

Enterprise flash storage refers to systems that comprise multiple flash drives housed in datacentre rack-mounted array form factor products.

In enterprise flash storage arrays, the capacity of many drives is aggregated, with access to storage media governed by controller hardware.

The controller is the compute that provides the intelligence needed to handle input/output (I/O) between hosts and storage and to decide how data is allocated to media. In flash arrays, it also carries out maintenance tasks such as wear levelling and garbage collection.

Enterprise flash storage array capacities run from tens of terabytes (TB) to many petabytes (PB). As with HDD-based arrays, access to storage can be block (for performance-hungry database use cases, for example), file (for general use and unstructured data) or object (for unstructured data also).

What is QLC flash storage?

QLC is the latest generation of flash storage media. QLC stands for quad-level cell. That means that every cell in the flash chip can store four bits of data using 16 states.

That means it can store more data in the same space than TLC flash, which stores three bits per cell and is also widely available. Single-level cell (SLC) flash and multi-level cell (MLC, meaning two bits per cell) flash were previously widely available, but these have been largely superseded now.

At the start of 2024, most enterprise storage arrays are built with TLC drives for general-purpose and mission-critical use cases. But QLC has edged into the mainstream and gained traction for unstructured data workloads, in particular with key enterprise storage array makers adding QLC-based products in the past year or so.

As manufacturers increase the number of possible states per cell, storage density increases and the cost of storage per GB decreases. But, as storage density increases in terms of cell capacity, issues can arise that can limit the endurance of flash media.
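The relationship between bits per cell and the number of states each cell must distinguish can be sketched in a few lines of Python. This is an illustrative calculation only: the function names are made up for this example, and the bits-per-cell figures are the standard ones for each flash generation rather than numbers taken from any one supplier.

```python
# Bits stored per cell for each flash generation named in the article.
BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def states_per_cell(bits: int) -> int:
    """Distinct charge states a cell must hold to encode `bits` bits."""
    return 2 ** bits


def density_gain(new: str, old: str) -> float:
    """Capacity of `new` relative to `old` for the same number of cells."""
    return BITS_PER_CELL[new] / BITS_PER_CELL[old]


for name, bits in BITS_PER_CELL.items():
    print(f"{name}: {bits} bit(s) per cell, {states_per_cell(bits)} states")

print(f"QLC vs TLC capacity per cell: {density_gain('QLC', 'TLC'):.2f}x")
```

Each extra bit doubles the number of states the cell must distinguish while adding only one bit of capacity, which is one way to see why endurance becomes harder to maintain as density rises.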

What is NVMe flash?

Non-volatile memory express (NVMe) is a protocol developed especially for use with flash storage. Prior to NVMe, flash drives used transport protocols that originated during the HDD era, namely Serial Advanced Technology Attachment (SATA) and Serial-Attached SCSI (SAS). In fact, these are still in use and arrays that use drives with such connectivity (2.5in and 3.5in form factor) are sold by the big storage suppliers.

But NVMe is at the forefront now for flash drive performance. NVMe’s key innovation was to optimise queues and buffers for use with flash, which improved performance many times over.

As a follow-on, suppliers then developed ways of allowing NVMe connectivity over longer distances across the datacentre. Such NVMe-over-fabrics technologies include the ability to carry NVMe via Ethernet, InfiniBand, TCP, RDMA (ie, memory-to-memory connectivity) and more.

What is HDD?

Hard disk drives (HDDs) that rely on magnetic read/write heads and mechanically spinning disks have been around for decades, with flash a competitor that has emerged in the past 10 years or so.

As with flash, HDDs can be aggregated into datacentre rack-mounted array products and the capacity of multiple drives pooled for enterprise users. In fact, HDD-based arrays long preceded enterprise flash arrays and are still widely used.  

What’s the difference in performance between flash and HDD?

When we look at flash versus disk, the key thing that stands out is that flash is fast – many times faster than spinning disk HDD.

Flash drives offer lower latency, with access times down to low milliseconds, or even microseconds, compared with the multiple milliseconds of spinning disk, particularly for reads. That means enterprise flash can also offer vastly more input/output operations per second (IOPS) when aggregated into a storage array.
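The link between latency and IOPS can be made concrete with Little's law: sustained operations per second are roughly the number of requests in flight divided by the time each one takes. The sketch below is a rough model, not a benchmark. The 5ms and 100µs latencies are assumed round numbers chosen for illustration, not measurements from the article, and `iops_estimate` is a hypothetical helper.

```python
def iops_estimate(latency_seconds: float, queue_depth: int = 1) -> float:
    """Little's law estimate: requests in flight / time per request."""
    return queue_depth / latency_seconds


# Assumed, illustrative per-request latencies (not measured values).
HDD_LATENCY_S = 5e-3      # 5 milliseconds
FLASH_LATENCY_S = 100e-6  # 100 microseconds

hdd_iops = iops_estimate(HDD_LATENCY_S)
flash_iops = iops_estimate(FLASH_LATENCY_S)

print(f"HDD at queue depth 1:   about {hdd_iops:,.0f} IOPS")
print(f"Flash at queue depth 1: about {flash_iops:,.0f} IOPS")
print(f"Ratio: about {flash_iops / hdd_iops:.0f}x")
```

Real drives deviate from this simple model, because flash can service many requests in parallel while a disk head can only be in one place at a time, so the gap tends to widen further as queue depth grows.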

In throughput terms, flash offers gigabit-per-second (Gbps) rates four or five times quicker than HDD.

Such rapidity has been the key draw for enterprise flash storage and is a result of the lack of moving parts. With spinning platters, HDD is limited by physics in ways that solid-state storage is not.

In terms of capacities, HDD is available in units of up to around 22TB. Some flash drives have been marketed that run to 60-plus terabytes, but flash drives generally come in smaller sizes, in part because of cost.

What’s the cost difference between flash and HDD?

In terms of per-GB cost at drive level, flash costs more than spinning disk.

Flash prices spiked significantly in late 2023 and the early months of 2024 as manufacturers throttled back production in an effort to raise prices and achieve profitability.

Solid-state drive (SSD) prices per gigabyte reached an average of $0.095/GB by April 2024, which was a rise of 26.67% since autumn 2023.

But, flash drive prices then fell steadily over the first three quarters of 2024 to an average of $0.085 per gigabyte (GB) in September 2024.

In October 2023, flash had averaged $0.075/GB while HDD averaged $0.05/GB for SAS and $0.035/GB for SATA drives.

Average spinning disk (SAS and SATA) hard drive prices held steady during the six months to September 2024 at $0.039 per gigabyte. That figure was $0.041/GB in early April.

For a customer that planned to deploy 20TB of flash, based on those prices, it would have cost $1,500 in October 2023, $1,900 in April 2024, and $1,700 in September 2024. That compares to the equivalent for spinning disk of $850 in October 2023 and $780 in September 2024.
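The arithmetic behind those totals can be reproduced with a short Python sketch. Two assumptions are needed to match the quoted figures: a terabyte is counted as 1,000GB, and the October 2023 spinning disk price is taken as the simple mean of the SAS and SATA averages. The helper names are invented for this example.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes, which matches the quoted totals

# Average prices per GB quoted in the article, in US dollars.
FLASH_PRICE_PER_GB = {"2023-10": 0.075, "2024-04": 0.095, "2024-09": 0.085}
HDD_PRICE_PER_GB = {
    "2023-10": (0.05 + 0.035) / 2,  # assumption: simple mean of SAS and SATA
    "2024-09": 0.039,
}


def deployment_cost(capacity_tb: float, price_per_gb: float) -> float:
    """Drive-level cost in dollars for `capacity_tb` of raw capacity."""
    return capacity_tb * GB_PER_TB * price_per_gb


def percent_change(old: float, new: float) -> float:
    """Percentage change from `old` to `new`."""
    return (new - old) / old * 100


for month, price in FLASH_PRICE_PER_GB.items():
    print(f"Flash {month}: ${deployment_cost(20, price):,.0f}")
for month, price in HDD_PRICE_PER_GB.items():
    print(f"HDD   {month}: ${deployment_cost(20, price):,.0f}")

rise = percent_change(FLASH_PRICE_PER_GB["2023-10"], FLASH_PRICE_PER_GB["2024-04"])
print(f"Flash price rise, October 2023 to April 2024: {rise:.2f}%")
```

These are drive-level media costs only. They leave out controllers, enclosures, power and data reduction, all of which affect the real cost of a deployed array.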

Will flash kill HDD? How much longer for HDD?

Pure Storage, in particular, has declared that HDDs will be dead by 2028, with its flash products the chief agent in the cull, owing to its ability to aggregate much more flash capacity on its proprietary modules than is possible on commodity flash drives.

With flash module sizes of up to 300TB by 2026 promised by Pure, it contends that spinning disk will be commercially unviable.

Meanwhile, companies such as Panasas, which specialises in storage for unstructured data, point to hyperscaler datacentres’ overwhelming use of spinning disk in ratios up to 90/10 against flash. Panasas argues that there’s still a five-times differential between the lowest-cost flash and HDD, and that for most, something like the hyperscaler solution is optimal.  
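To see why a hyperscaler-style mix is attractive at a five-times price differential, the blended cost per gigabyte can be sketched as a weighted average. This is a simplified illustration: it reads the 90/10 ratio as a share of capacity, normalises the HDD price to 1 unit per GB, and ignores performance, power and footprint. `blended_cost_per_gb` is a hypothetical helper.

```python
def blended_cost_per_gb(hdd_share: float, hdd_price: float, flash_multiple: float) -> float:
    """Capacity-weighted average price per GB for a mixed HDD and flash estate."""
    flash_share = 1 - hdd_share
    flash_price = hdd_price * flash_multiple
    return hdd_share * hdd_price + flash_share * flash_price


HDD_PRICE = 1.0       # normalised unit price, not a real dollar figure
FLASH_MULTIPLE = 5.0  # the five-times differential cited by Panasas

mixed = blended_cost_per_gb(0.9, HDD_PRICE, FLASH_MULTIPLE)
all_flash = blended_cost_per_gb(0.0, HDD_PRICE, FLASH_MULTIPLE)

print(f"90/10 HDD/flash mix: {mixed:.2f} units per GB")
print(f"All-flash:           {all_flash:.2f} units per GB")
print(f"All-flash premium:   {all_flash / mixed:.2f}x")
```

Under these assumptions, the all-flash estate costs several times more per gigabyte than the mixed one, which is the gap that falling flash prices would have to close.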

When can you use flash and HDD in the cloud?

Enterprise users can also specify flash storage and spinning disk in the cloud. It is more likely in most cases that cloud storage will be specified by performance and cost criteria, in which case the customer may never know what media underlies it.

But it is possible also to specify flash storage in the cloud and the three largest hyperscalers – Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP) – have solid-state storage options that mix cost, capacity and performance. 

The hyperscalers all offer flash storage to support compute with service levels based on capacity and IOPS per volume that range from general-purpose to premium levels aimed at specific workloads (eg, SQL, Oracle, SAP Hana) and environments (eg, Windows, Lustre, macOS).

There are also options aimed at flash for file storage and flash storage from named suppliers, such as Azure’s NetApp Files.

What is the all-flash datacentre?

For about a decade, the idea of the all-flash datacentre has been discussed. The all-flash datacentre replaces HDD and other media such as tape with flash storage.

Driving it is the continued decrease in the cost of flash storage – as with QLC flash – but also the advantages of flash in terms of rapid access. The latter becomes more relevant as customers want to run analytics on bigger subsets of their data.

So, for example, where backups may previously have been held on nearline media such as slower HDDs, advocates of flash for such use cases point to the ability to run artificial intelligence (AI) on large customer datasets and to gain value therefrom.

Also, with backups as an example, the idea of being able to recover quickly from flash media in case of a ransomware attack is another use case touted by all-flash datacentre boosters. 

When will the all-flash datacentre arrive?

While enthusiastic suppliers of flash storage such as Pure talk down the obstacles to the all-flash datacentre, analysts point to the spread of (especially QLC) flash into secondary workloads but not necessarily all use cases, with spinning disk likely to retain its usefulness for some time for some datasets.

Meanwhile, HDD suppliers such as Toshiba say around 85% of all data is still on spinning disk. That fact, it says, is not likely to change rapidly, not least because the flash capacity to replace it doesn’t exist.
