Following its independence from its parent company, Toshiba Memory is now Kioxia. With enterprise workloads moving away from HDDs, the role flash plays will change just as dramatically.
Toshiba Memory has completed its rebranding to Kioxia, following its 2017 spin-off from Toshiba proper—a move prompted by Toshiba’s financial troubles that resulted from its purchase of Westinghouse Electric. Toshiba’s PC OEM business adopted the preexisting DynaBook name in its rebranding, while Kioxia was created by combining the Japanese word kioku (meaning “memory”) and the Greek word axia (meaning “value”).
Kioxia’s bona fides in the flash market are well established, as NOR and NAND flash were invented at Toshiba in 1984 and 1987, respectively. Given flash’s ubiquity in consumer electronics—removable media such as SD cards, as well as eMMCs in phones and SSDs in computers—the flash storage business is burgeoning, and Kioxia’s recent independence from Toshiba is likely intended to instill confidence in its future.
Jeremy Werner, senior vice president and general manager of the SSD business unit at Kioxia, spoke with TechRepublic about what the next decade holds for Kioxia.
TechRepublic: Kioxia is pursuing the storage class memory (SCM) market with XL-FLASH, introduced in August, which will be available first in SSDs, before being offered as an NVDIMM. What SSD form factors is XL-FLASH going to be available in?
Jeremy Werner: At Flash Memory Summit, we had a number of third-party SSD manufacturers—because Kioxia is an enabler for companies who build all kinds of products, including their own solid-state drives, and in that context, we provide them with the memory and technical support. We also build our own solid-state drives, which we sell to various markets.
There were about 10 different companies that announced demonstrations or support for XL-FLASH in their SSDs, in various form factors. For our internal products, what we showed at Flash Memory Summit was an M.2 22110 proof-of-concept drive, and that’s what we are now sampling to customers for evaluation.
TechRepublic: What workloads would benefit from XL-FLASH over traditional, mainstream NAND?
Werner: As of today, the percentage of persistent storage class memory that XL-FLASH fits into—compared to traditional flash in server and storage applications—is less than 0.1%. If you look at where we are today, even though there’s been a lot of promotion of this technology by the industry as a whole, it’s really at the very beginning of adoption. Mostly what we see is a total system cost optimization with tiering of different types of memory.
If you have, for instance, a large database, and an access profile across that database, your hottest data would be in DRAM, then hot data would be in SSDs, and if it’s a very large database, you might put cold or archive data in a hard drive.
XL-FLASH fits into that system from a total cost perspective. Depending on the access profile, and the predictability of the data accesses, XL-FLASH can fit into the hierarchy between DRAM and traditional SSDs, oftentimes actually reducing the amount of DRAM required by the system, and delivering a similar overall system performance at a lower cost.
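The tiering argument Werner makes is ultimately a cost calculation. As a back-of-envelope sketch—using entirely hypothetical per-gigabyte prices and capacity mixes, not Kioxia figures—it might look like this:

```python
# Illustrative sketch of the tiered-memory cost trade-off described above.
# All $/GB prices and capacities are hypothetical placeholders.

def system_cost(gb_dram, gb_scm, gb_ssd,
                price_dram=3.00, price_scm=0.80, price_ssd=0.10):
    """Total media cost in dollars for a given capacity mix."""
    return gb_dram * price_dram + gb_scm * price_scm + gb_ssd * price_ssd

# A configuration leaning heavily on DRAM caching in front of SSDs...
baseline = system_cost(gb_dram=1024, gb_scm=0, gb_ssd=10_000)

# ...versus shrinking DRAM and inserting an XL-FLASH-style SCM tier
# between DRAM and the SSDs, as in the hierarchy described above.
tiered = system_cost(gb_dram=256, gb_scm=1024, gb_ssd=10_000)

print(f"DRAM-heavy: ${baseline:,.0f}")
print(f"Tiered:     ${tiered:,.0f}")
```

Whether the tiered configuration actually delivers similar performance depends on the access profile—how often requests miss the smaller DRAM tier and land in SCM instead.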
TechRepublic: For enterprise-targeted products, the common difficulty is that M.2 is not hot-swappable. What form factors are being pursued by Kioxia for hot-swappable deployments like all-flash arrays?
Werner: The logical common form factor there would be a 2.5″ drive. We announced two products [at Flash Memory Summit], CM6 and CD6, which were the first publicly demonstrated PCIe 4.0 enterprise and datacenter SSDs. Those were in U.3, 2.5″ form factors, and U.3 is fully backwards compatible into a U.2 bay, but also allows you to put the drive into a U.3 universal tri-mode bay.
TechRepublic: With the transition to PCI Express 4.0, there is increasing concern about heat dissipation on SSDs. With increased speeds, more heat is being generated, resulting in potentially reduced performance due to thermal throttling. How is this being addressed in designs by Kioxia?
Werner: For the most part, we’re able to cool at typical air temperatures and airflow speeds we see in the data center, even in PCIe 4.0, in a 2.5″ drive, but you’re right that it is getting more challenging. In the [PCIe 4.0] timeframe, the problem is really starting to emerge clearly; by PCIe 5.0, it will fully emerge.
One of the things that we’ve been doing is working with customers on new form factors. A lot of focus has been around EDSFF… which is more optimized for flash, it can support much higher power delivery for higher performance. [EDSFF] is laid out in a way to enable greater system density than taking the old 2.5″ form factor, and the design allows better thermal transfer and heat dissipation.
Migrating form factors is a major investment, and not something that system designers want to do, because the 2.5″ form factor is still readily available. Some of these new form factors are harder to get what you need [in volume] or multi-source, and the 2.5″ form factor is good enough for a lot of people’s system designs. That’s why so many people have stuck with it.
If systems drop support for hard drives and SATA and go NVMe flash attach only, these new form factors become attractive. We’re starting to see the trend towards them, but mass adoption of these new form factors will begin in 2021 and continue through 2025. We expect the 2.5″ form factor still has at least another five or six good years ahead of it.
TechRepublic: So, 2.5″ is a form factor that you’re inheriting from traditional hard drives. Some of the orthodoxy that applied there isn’t necessarily applicable with flash. One of those is RAID, which can wear down SSDs faster due to write amplification, among other effects.
What strategies can you apply from a manufacturing side, on the controller or firmware, to mitigate this type of issue? What should customers do to avoid issues in deployment?
Werner: We have internal parity protection in all of our enterprise and data center drives that can do things like recover a failed NAND die. This type of redundancy doesn’t exist in a hard drive at all today. If you look at the failure rates of SSDs—forget the specs for a second, but what we actually see in the field—our enterprise and data center SSD field failure rates are something like one-tenth of what you see in hard drives. When you take into account internal recovery mechanisms and look at probabilities, maybe a system that needed RAID 6 now can go to RAID 5.
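The "RAID 6 can go to RAID 5" claim can be sanity-checked with a simple binomial model. This is a rough sketch with hypothetical annual failure rates and the simplifying assumption of independent failures (it ignores rebuild windows, which matter in practice):

```python
# Back-of-envelope check: if per-drive annual failure rates drop ~10x,
# how does the chance of exceeding the array's failure tolerance change?
# AFR values below are hypothetical, not field data.
from math import comb

def p_exceeds_tolerance(n_drives, afr, tolerance):
    """Probability that more than `tolerance` of n drives fail in a year,
    assuming independent failures (a simplification)."""
    return sum(comb(n_drives, k) * afr**k * (1 - afr)**(n_drives - k)
               for k in range(tolerance + 1, n_drives + 1))

hdd_afr, ssd_afr = 0.02, 0.002   # hypothetical: SSDs at one-tenth the HDD rate
# 10-drive array: RAID 6 tolerates 2 concurrent failures, RAID 5 tolerates 1.
print(f"HDD array under RAID 6: {p_exceeds_tolerance(10, hdd_afr, 2):.2e}")
print(f"SSD array under RAID 5: {p_exceeds_tolerance(10, ssd_afr, 1):.2e}")
```

Under these assumed rates, the SSD array running the weaker RAID level still comes out ahead of the HDD array running the stronger one, which is the shape of the argument Werner is making.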
If you look at cloud-native infrastructure, and how databases typically deploy in the cloud, for a number of reasons, those databases are using triple replication across the cloud.
When you go to cloud-native orchestrated infrastructure and scale out applications, there’s not a need [for] RAID, because it’s already being replicated at the application level. We’ve been working on software that enables network storage in a disaggregated fashion for cloud-native users called KumoScale. This integrates with the application and the orchestration layers, to take advantage of replications that already exist, to not waste space where you already have adequate levels of protection in the cloud stack.
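The space argument here is straightforward to quantify. A minimal sketch—with illustrative figures, not KumoScale specifics—of why layering RAID under application-level replication wastes capacity:

```python
# Usable-capacity comparison: if the application layer already keeps three
# replicas, adding RAID underneath multiplies the overhead. Illustrative only.

def usable_fraction(replicas=1, raid_data=None, raid_parity=0):
    """Fraction of raw capacity that stores unique user data."""
    raid_eff = raid_data / (raid_data + raid_parity) if raid_data else 1.0
    return raid_eff / replicas

# Triple replication on plain drives vs. triple replication over RAID 6 (8+2):
print(f"3x replication, no RAID: {usable_fraction(replicas=3):.0%}")
print(f"3x replication + RAID 6: {usable_fraction(3, raid_data=8, raid_parity=2):.0%}")
```

With replication already providing protection, the RAID parity overhead buys little additional safety but permanently reduces usable capacity—hence software that detects existing replication and skips the redundant layer.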
TechRepublic: Do you think that cloud-native deployments will eventually result in a post-RAID mindset for enterprise data storage?
Werner: I think there are already people operating in a post-RAID enterprise data storage world, but there are still people running COBOL applications in their data centers. To migrate every application over to new architectures and things like that takes a very, very long time.
There’s still, in my view, 20 years runway where some people are going to need RAID. Certainly, for cloud-native architecture and new applications, when you use flash and modern software stacks, you can get away from the need for RAID while still having your data protected.
TechRepublic: Kioxia previously announced its research into penta-level-cell (PLC) NAND, a topic which has seen renewed interest as Intel announced it is evaluating PLC. There are a lot of takes about this on the internet; some claim that PLC will lead to denser, cheaper drives for client devices. What do you think the future holds for PLC?
Werner: One of the challenges with taking new, more dense, lower cost memories into the client space is… the sweet spot for storage in a corporate notebook is about 256 GB. For consumers, it’s 512 GB. Our leading-edge die is a 512 gigabit die, in production. Take the 256 GB SSD—you’re down to just four die. You still want to deliver a lot of performance out of those die, but you don’t necessarily have a lot of parallelism in an SSD. As we go to QLC and PLC to really take advantage of more dense storage, companies will want to increase the capacity of the die, otherwise you’re paying for a lot of logic overhead on the die.
If you continue down that path, unless we see the sweet spot—and clients—move up significantly in capacity, it becomes difficult to deliver the same kind of performance if you only have one or two die in an SSD. For less performance sensitive, low density applications, let’s say surveillance cameras that are just writing sequentially in a loop, that might make sense.
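The die-count arithmetic Werner walks through is simple to work out: SSD capacity is quoted in gigabytes while die density is quoted in gigabits, so a 256 GB drive built from 512 Gbit dies holds only four of them. A quick sketch:

```python
# Die-count arithmetic behind the parallelism concern: fewer dies per SSD
# means fewer NAND targets the controller can access in parallel.

def dies_per_ssd(capacity_gb, die_gbit=512):
    """NAND dies needed: gigabytes * 8 bits/byte / die density in gigabits."""
    return capacity_gb * 8 // die_gbit

for cap in (256, 512, 1024):
    print(f"{cap} GB SSD at 512 Gbit/die -> {dies_per_ssd(cap)} dies")
```

Doubling die density (say, to a hypothetical 1 Tbit die) halves these counts again, which is why denser QLC/PLC dies squeeze client-capacity drives hardest.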
Those make the most sense initially for what I’ll call warm or hot read, but colder write applications. Imagine Instagram—people post a lot of pictures. People aren’t overwriting a lot of the data that they store online, in the cloud, in any of these applications, but many people visit it, reading it over and over and over.
You want very fast read access to that data—there is a lot of data if it is pictures or videos, that generates a lot of bits that need to be stored reliably and consistently—but it’s not overwritten very frequently. Flash is very, very good in terms of read performance compared to other media. In terms of reads per dollar, it’s probably the most cost-effective technology in the world. Because of the relatively low endurance cycles—because you’re not overwriting that stuff—that’s where I could see QLC and PLC really taking off in the near term.
TechRepublic: It’s no secret that Mark Zuckerberg is going around saying Facebook needs very cheap flash that does not need to withstand a lot of writes for exactly that type of workload. In terms of the number of write cycles, does the idea of putting bare QLC or PLC in client devices make sense to you?
Werner: We think QLC technology in client devices will make sense in the future, but today we think that TLC is still the most effective for client SSDs.