SSD write amplification and endurance
Summary:
How is SSD endurance measured, and what is the impact of write amplification?
Details
I have an Intel SSD Pro 2500 Series 480 GB drive. Recently I got a notification that my drive is failing, which surprised me because it is less than two years old. I used the Intel SSD utility to check the SMART status and saw the results shown in the screenshot.
A few things stand out:
E8 (available reserved space) is low. This is probably why I got a notification saying the disk was failing.
Total host writes are about 8.9 TB. That's a lot, but not an unreasonable amount.
Total NAND writes are about 202 TB. That's far more, and I think it's close to the endurance limit of the drive.
I looked at the drive manual and the specs say:
Minimum Useful Life/Endurance Rating
The SSD will have a minimum of five years of useful life under typical client workloads of up to 20 GB of host writes per day.
By my calculation that means the drive supports 20 GB/day * 365 days/year * 5 years = 36.5 TB of writes.
So my 8.9 TB of host writes is well under the 36.5 TB threshold, but the 202 TB of NAND writes is well above it.
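(For reference, that is also where my write amplification figure below comes from: write amplification factor = NAND writes / host writes = 202 TB / 8.9 TB ≈ 22.7.)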
My first question is: Are drive endurance figures based on host writes or NAND writes?
I assume it is NAND writes, since that's what the drive is actually doing, but if anyone has a more concrete answer that would be useful.
My second question is: What are reasonable values for write amplification under Windows client workloads, and is my 22.7x write amplification high? If so, how can I reduce it?
I checked whether TRIM is enabled, and I think it is:
C:\WINDOWS\system32>fsutil behavior query DisableDeleteNotify
NTFS DisableDeleteNotify = 0
I found one white paper that indicates this is a normal level of write amplification:
The write amplification factor on many consumer SSDs is anywhere from 15 to 20.
But I've also seen other sources indicating that write amplification should be closer to 1, which is why I'm asking here to see if anyone has useful insight.
Addendum: BitLocker Question
In the process of writing this question I found a review of my drive. Part of the review says:
The TRIM issue has not changed. Again it is not a problem unless you use software encryption because otherwise there will always be compressible data, but given the Opal and eDrive support in the Pro 2500, I do not see why anyone would opt for the Pro 2500 if the plan is to utilize software encryption.
I don't know exactly what the TRIM issue is, but my drive does have encryption enabled with BitLocker (part of IT policy). I searched for how to check whether the drive uses built-in encryption, and it seems that my drive is not using hardware encryption. Running the command manage-bde.exe -status c:
shows:
C:\WINDOWS\system32>manage-bde.exe -status c:
BitLocker Drive Encryption: Configuration Tool version 10.0.14393
Copyright (C) 2013 Microsoft Corporation. All rights reserved.
Volume C: [Windows]
[OS Volume]
Size: 445.05 GB
BitLocker Version: 2.0
Conversion Status: Fully Encrypted
Percentage Encrypted: 100.0%
Encryption Method: AES 256
Protection Status: Protection On
Lock Status: Unlocked
Identification Field: Unknown
Key Protectors:
TPM And PIN
Numerical Password
My guess now is that the SandForce controller is not coping well with incompressible encrypted data (the Encryption Method of AES 256, rather than hardware encryption, indicates BitLocker is encrypting in software). Still, if anyone has more detailed answers to my questions, they are appreciated.
Tags: ssd
Note: DisableDeleteNotify only tells you that Windows will send TRIM if it's supported by the disk; it doesn't guarantee the disk is willing to accept it (see superuser.com/questions/145697/… for details). Having said that, I'd be surprised if an Intel SSD didn't support TRIM. Additionally, a user comment in forums.sandisk.com/t5/All-Other-SanDisk-SSD/… says endurance is apparently relative to host writes. – Anon, Jun 17 '18 at 19:24
2 Answers
In an AnandTech SSD review there is a section titled Endurance Ratings: How They Are Calculated, which shows how a rated endurance figure is derived from the NAND's endurance and the expected write amplification.
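The gist (paraphrasing the usual relation rather than quoting the review's exact formula) is: rated host-write endurance ≈ (usable NAND capacity × P/E cycles per cell) / write amplification factor.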
So when you see "host writes", that's a reference to what the OS sends down, not the real NAND writes that have to be done to fulfil them.
Are drive endurance figures based on host writes or NAND writes?
In this case host writes. Since the manual said this:
"The SSD will have a minimum of five years of useful life under typical client workloads with up to 20 GB of host writes per day." [emphasis added]
there's no ambiguity - the statement refers to host writes, not NAND writes. Note that it's a fluffy statement because it says things like "typical". In this case we don't know the true maximum NAND write value because it isn't stated in the manual.
The Tom's Hardware article Intel Clarifies 600p SSD Endurance Limitations, But TBW Ratings Can Be Misleading says you can't safely estimate when your drive will die based on Drive Writes Per Day/TeraBytes Written and you should only depend on the Media Wearout Indicator (MWI) value. This makes sense because it's an estimate dependent on your writes conforming to a particular model. The article also states a drive will function beyond the MWI reaching its final value so long as you have spare cells (but it looks like you're running low on those).
What are reasonable values for write amplification on Windows client workloads and is my 22.7 times write amplification high? If so how can I reduce it?
An Intel forum post by an Intel employee suggests a write amplification factor (WAF) of 1-4 is "normal" but that it may be as high as 10. A WAF of 22.7 is likely higher than average, but ultimately the value is going to be highly situational and, as you pointed out, because of encryption the SSD won't see much compressible data.
A Microsoft Understanding SSD endurance blog post says there can be lots of different reasons for amplification:
repair jobs generate additional IO; data deduplication generates additional IO; the filesystem, and many other components, generate additional IO by persisting their metadata and log structures; etc. In fact, the drive itself generates write amplification from internal activities such as garbage collection!
At the end of the day, unless you're going to change your data workload (by making it sequential and highly compressible), options for reducing the WAF are limited. Assuming things like partitions are well aligned, about the only thing you could do is manual over-provisioning: empty the entire SSD with a secure erase (which itself will contribute to the SSD's wear) and then make sure the partitions you create don't cover the full size of the SSD by some percentage, thus creating artificial spare area (a rough sketch follows the quote below). From Intel's Solid-State Drives in Server Storage Applications paper:
Important: The SSD must be new, fresh out-of-the-box or must be erased using the ATA SECURITY ERASE UNIT command immediately before adjusting the usable capacity.
- (Recommended) Create partition(s) that occupy only the desired usable capacity and leave the remaining capacity unused.
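As a rough sketch of that recommendation on Windows (my assumptions, not from the Intel paper: the drive has already been secure erased, it shows up as disk 1 in diskpart, and the D: letter is free; the disk number and size are examples only, and clean destroys everything on the selected disk), a diskpart session that leaves roughly 10-15% of a 480 GB drive unpartitioned could look like:

C:\WINDOWS\system32>diskpart
DISKPART> list disk
DISKPART> select disk 1
DISKPART> clean
DISKPART> create partition primary size=400000
DISKPART> format fs=ntfs quick
DISKPART> assign letter=D
DISKPART> exit

The capacity beyond the 400000 MB partition is never partitioned or written, so the controller can use it as extra spare area.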
Additional references:
- https://www.samsung.com/us/business/oem-solutions/pdfs/SSD-Sales-Presentation.pdf
- https://en.wikipedia.org/wiki/Write_amplification
– Anon, answered Jun 17 '18 at 19:15 (edited Jun 18 '18 at 5:01)
For most consumer workloads, write amplification is typically no more than about 2x. 22.7x is far higher write amplification than normal and is often indicative of a problem. Unfortunately, that problem lies in your SSD's controller itself.
The unusually high write amplification you're seeing is caused by the combination of full-disk encryption on a SandForce-based SSD and a faulty TRIM implementation on the SandForce SF-2281 with firmware versions 5.0.1 and 5.0.2.
Full-disk encryption interacts poorly with the data compression technology used by SandForce controllers
SandForce SSD controllers are well-known for relying on data compression (branded as DuraWrite) to increase SSD endurance. In essence, the DuraWrite feature compresses data as it is sent to the controller and writes the compressed data to the NAND, and decompresses it when it is read. The idea is that user data is often compressible, and taking advantage of this allows the controller to write less data to the NAND than is actually sent to the drive. As such, with the right workloads, SandForce-based drives are among the few that can achieve a write amplification factor of less than one.
However, encrypted data can't be effectively compressed, so this breaks down. This reliance on compression means that when faced with incompressible data, SandForce controllers suffer from lower performance and gain no endurance benefit from the DuraWrite feature.
TRIM does not work properly on SandForce SF-2281 firmware versions 5.0.1 and 5.0.2
A key limitation of NAND flash memory is that while it can be written in small pages, existing data can't be rewritten in place; it must first be erased, which can only be done in whole blocks of several dozen pages. Writes must also be spread out to prevent any single area of the drive from wearing out prematurely (wear leveling). Furthermore, drives must assume that all data previously written to the drive is still valid until the OS tells them otherwise via the TRIM command. As a result, an SSD that appears to be much less than full to the operating system may be internally close to full. You're more likely to see high write amplification if the drive gets lots of small random writes while it's internally full; essentially, the drive has to erase whole blocks and rewrite significant amounts of existing data when you're actually trying to write small bits and pieces of information to the drive.
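To put rough, purely illustrative numbers on that (actual page and block sizes vary by drive): with 4 KB pages and 256-page (1 MB) blocks, changing a single 4 KB page in a block whose other 255 pages still hold valid data can force the controller to relocate and rewrite roughly 1 MB of data for 4 KB of host writes, a worst-case write amplification of about 256. TRIM matters because it tells the controller which pages no longer hold valid data and therefore never need to be copied.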
Unfortunately, the SF-2281 controller used in the Intel SSD Pro 2500 (and a number of other SSDs at the time) shipped with buggy firmware that caused TRIM to not work properly. As a result, the controller is not able to properly perform garbage collection and needs to rewrite data previously written to the drive repeatedly, even if it is no longer valid. While BitLocker does support TRIM, it is effectively useless on this drive. As a result, the drive effectively behaves as if it's always completely full, resulting in very high write amplification.
What can I do about this?
Given the SMART status of the drive, you should replace it as soon as possible. Modern SSDs generally have functioning TRIM and will not suffer from this problem. As far as I can tell, the Intel SSD Pro 2500 never received a firmware update to address this problem.
This issue could have been mitigated by overprovisioning the drive so that less than its full capacity is actually in use. Unfortunately, because TRIM does not work, simply shrinking the system partition will not help, because the freed area can't be trimmed to tell the drive that it can be used as spare area. You would need to wipe the whole drive using the secure erase command, partition the drive appropriately, and never use the unpartitioned/unformatted space for storing any actual data. Indeed, this is already mentioned in Anon's answer:
Important: The SSD must be new, fresh out-of-the-box or must be erased using the ATA SECURITY ERASE UNIT command immediately before adjusting the usable capacity.
- (Recommended) Create partition(s) that occupy only the desired usable capacity and leave the remaining capacity unused.
– bwDraco, answered Jan 28 at 2:21 (edited Jan 28 at 4:43)