Channel: Comments on: Struggling with Advanced Format during a LVM to RAID migration

By: madduck

Well, the idea is really the same on all layers. The reason you want your partitions aligned is so that when you write a block at a higher level, at most one block (or group of blocks) gets written at the lower level.

Advanced Format means that the drive exposes 512-byte sectors, but whenever you write one, it actually updates a 4k block, rewriting the other seven 512-byte sectors unchanged. This is nothing new; memory has done this forever (e.g. writing a boolean (1 bit) has pretty much always meant writing a whole word (16, 32, or nowadays often 64 bits)).
Compilers have grown really good at aligning memory writes, and one has to go to great lengths to trick them into doing something else (__unaligned etc.).

With disks, we are only now really starting to run into this issue, and it will get worse. The layer on top of the device usually works with larger blocks than 512 bytes. For instance, a filesystem might have 4k blocks, while LVM might even use 4M blocks ("PE size").

Now imagine you have a 33 KB file sitting on a standard ext4 filesystem. This file will occupy ceil(33/4)=9 filesystem blocks. If you write this file, the filesystem will successively write these blocks to the underlying device. With 512-byte physical sectors, there's hardly a problem, but if the disk works with 4k blocks, then imagine what happens if the first filesystem block starts at sector 7: to write the first filesystem block, the disk needs to write sectors 7–14, which means rewriting sector groups 0–7 and 8–15. To write the second filesystem block (sectors 15–22), the disk needs to rewrite sector groups 8–15 and 16–23, and so on. The problem gets worse when the filesystem cannot allocate sequential blocks, because then no driver or device can be smart enough to consolidate the writes. In short: twice the amount of work is needed.
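The arithmetic of the sector-7 example can be checked with a few lines of Python. This is only a sketch of the counting argument, not a model of what any real drive or driver does: it assumes 512-byte logical sectors, 4k physical blocks, 4k filesystem blocks that are laid out contiguously but written one at a time (no consolidation), and the function names are made up for illustration.

```python
import math

SECTOR = 512      # logical sector size exposed by the drive (bytes)
PHYS = 4096       # physical block size of an Advanced Format drive (bytes)
FS_BLOCK = 4096   # filesystem block size (bytes)


def physical_blocks_touched(start_sector, n_fs_blocks):
    """For each filesystem block, list the physical 4k blocks it overlaps.

    start_sector is the 512-byte sector where the first filesystem
    block begins; the blocks are assumed to be contiguous.
    """
    touched = []
    for i in range(n_fs_blocks):
        first_byte = start_sector * SECTOR + i * FS_BLOCK
        last_byte = first_byte + FS_BLOCK - 1
        touched.append(list(range(first_byte // PHYS, last_byte // PHYS + 1)))
    return touched


def total_physical_writes(start_sector, n_fs_blocks):
    """Count physical block updates when each fs block is written separately."""
    return sum(len(b) for b in physical_blocks_touched(start_sector, n_fs_blocks))


n_blocks = math.ceil(33 / 4)                     # the 33 KB file: 9 fs blocks
aligned = total_physical_writes(0, n_blocks)     # first block at sector 0
misaligned = total_physical_writes(7, n_blocks)  # first block at sector 7
print(aligned, misaligned)  # 9 18
```

With the first block at sector 7, every filesystem block straddles two physical blocks, which is where the factor of two comes from.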
And if this is not enough, consider LVM writing a 4M extent (for every filesystem block updated) to an unaligned RAID6, meaning that all four disks have to be written with data and parity data, …

Therein lies the importance of ensuring that the 4k filesystem block is exactly aligned with the 4k device block. Aligning partitions is the first step, but now you need to ensure that every layer plays nice with the one underneath. The problem here is the metadata. As far as I can tell, filesystems squash their metadata into blocks as well, so there should not be a problem once you've aligned the filesystems like you did, using the partition table.

However, layers like MD and LVM are more obscure. For instance, MD uses 32k superblocks, but then adds an additional 2 bytes for each device. In addition, the different metadata versions put the superblocks at different positions on the device (see md_superblock_formats.txt in the mdadm doc directory). And then there is LVM…

I have to find a solution to all of this (WD drives + GPT + MD + LVM + dmcrypt + ext4) within the next 10 days, so watch http://madduck.net/blog/…
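One way to reason about "every layer plays nice with the one underneath" is to add up the data offsets of each layer and check the running total against the 4k boundary. The sketch below does just that. The layer names and every offset in it are made-up placeholders, not values that MD, LVM, or dmcrypt are known to use; the real numbers would have to be read from the partition table and from each layer's own metadata.

```python
PHYS = 4096  # physical block size in bytes


def check_stack(layers):
    """Report whether each layer's data area starts on a 4k boundary.

    layers is a list of (name, offset_bytes) pairs ordered bottom to top,
    where offset_bytes is where that layer's data begins relative to the
    start of the layer below it. Returns (name, absolute_offset, aligned).
    """
    report = []
    absolute = 0
    for name, offset in layers:
        absolute += offset
        report.append((name, absolute, absolute % PHYS == 0))
    return report


# Hypothetical stack; all offsets are illustrative assumptions.
stack = [
    ("partition", 2048 * 512),    # assumed: partition starts at sector 2048
    ("md data", 1024 * 1024),     # assumed MD data offset
    ("lvm first PE", 192 * 1024), # assumed LVM data start
    ("dmcrypt payload", 4096 + 512),  # deliberately misaligned example
    ("ext4", 0),                  # filesystem starts at the payload start
]

for name, absolute, ok in check_stack(stack):
    print(f"{name:16} starts at byte {absolute:>9} {'aligned' if ok else 'MISALIGNED'}")
```

Because the offsets accumulate, one misaligned layer shifts everything stacked on top of it: in this made-up stack the ext4 layer ends up misaligned even though its own offset is zero.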
