c - Memory alignment on modern processors?
I often see code such as the following when representing a large bitmap in memory:
    size_t width = 1280;
    size_t height = 800;
    size_t bytesPerPixel = 3;
    size_t bytewidth = ((width * bytesPerPixel) + 3) & ~3; /* aligned to 4 bytes */
    uint8_t *pixelData = malloc(bytewidth * height);
(That is, the bitmap is allocated as one contiguous block of memory whose bytewidth is aligned to a certain number of bytes, most commonly 4.)
A point on the image is then addressed via:
    pixelData + (bytewidth * y) + (bytesPerPixel * x)
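As a minimal sketch of the two expressions above, here they are wrapped in helper functions. The names `row_stride` and `pixel_at` are made up for illustration and do not come from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round the byte length of one row up to the next multiple of 4. */
static size_t row_stride(size_t width, size_t bytes_per_pixel)
{
    return (width * bytes_per_pixel + 3) & ~(size_t)3;
}

/* Return a pointer to the first byte of pixel (x, y).
 * `stride` is the padded row length returned by row_stride(). */
static uint8_t *pixel_at(uint8_t *pixels, size_t stride,
                         size_t bytes_per_pixel, size_t x, size_t y)
{
    assert(x * bytes_per_pixel < stride); /* x must lie inside the row */
    return pixels + stride * y + bytes_per_pixel * x;
}
```

For the 1280-pixel-wide, 3-bytes-per-pixel example, the row is already 3840 bytes, a multiple of 4, so no padding is added. A 5-pixel-wide row (15 bytes) would be padded to 16.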
This leads me to two questions:
- Does aligning a buffer like this have a performance impact on modern processors? Should I be worrying about alignment at all, or will the compiler handle it?
- If it does have an impact, could someone point me to a resource for finding the ideal byte alignment for various processors?
Thank you.
It depends on a lot of factors. If you only access the pixel data one byte at a time, the alignment will make no difference the vast majority of the time. For reading or writing a single byte of data, most processors do not care at all whether that byte sits on a 4-byte boundary.
However, if you access data in units larger than a byte (say, in 2-byte or 4-byte units), then you will definitely see alignment effects. On some processors (e.g. many RISC processors), accessing unaligned data is outright illegal at certain levels: on a PowerPC, for example, attempting to read a 4-byte word from an address that is not 4-byte aligned will generate a Data Access Exception (or Data Storage Exception).
On other processors (e.g. x86), accessing unaligned addresses is permitted, but it often comes with a hidden performance penalty. Memory loads and stores are often implemented in microcode, and the microcode will detect the unaligned access. Normally, the microcode fetches the proper 4-byte quantity from memory, but if the address is not aligned, it has to fetch two 4-byte locations from memory and reconstruct the desired 4-byte quantity from the appropriate bytes of the two locations. Fetching two memory locations is obviously slower than fetching one.
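If you do need a multi-byte value from an address that may not be aligned, one common portable approach (not something the question's code does) is to copy the bytes with `memcpy` rather than cast the pointer. A minimal sketch, with `load_u32` and `store_u32` as made-up helper names:

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address with no alignment guarantee.
 * memcpy is well-defined for any address, so this avoids dereferencing
 * a misaligned uint32_t pointer. */
static uint32_t load_u32(const uint8_t *p)
{
    uint32_t value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* Write a 32-bit value to an address with no alignment guarantee. */
static void store_u32(uint8_t *p, uint32_t value)
{
    memcpy(p, &value, sizeof value);
}
```

The value is read and written in the host's native byte order. Whether the copy compiles down to a single load or store depends on the compiler and target.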
That is just for simple loads and stores, though. Some instructions, such as those in the MMX or SSE instruction sets, require their memory operands to be properly aligned. If you attempt to access unaligned memory using those specialized instructions, you will see something like an illegal instruction exception.
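For buffers that will be handed to instructions with such requirements, you can request the alignment at allocation time. The sketch below assumes a C11 toolchain that provides `aligned_alloc` (some do not) and assumes a 16-byte requirement, which is what the aligned SSE loads and stores expect. `is_aligned` and `alloc_aligned16` are illustrative names.

```c
#include <stdint.h>
#include <stdlib.h>

/* Return nonzero if p sits on an `alignment`-byte boundary. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Allocate at least `size` bytes on a 16-byte boundary.
 * The size is rounded up to a multiple of 16 because C11 aligned_alloc
 * expects the size to be a multiple of the alignment.
 * Release the result with free(). */
static void *alloc_aligned16(size_t size)
{
    size_t rounded = (size + 15) & ~(size_t)15;
    return aligned_alloc(16, rounded);
}
```

A check such as `is_aligned(ptr, 16)` also lets code fall back to an unaligned path when it receives a buffer it did not allocate itself.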
To summarize, I would not worry too much about alignment unless you are writing super performance-critical code (for example, in assembly). The compiler helps you out a lot, e.g. by padding structures so that 4-byte quantities are aligned on 4-byte boundaries, and on x86 the CPU also helps you out when dealing with unaligned accesses. Since the pixel data you are dealing with comes in quantities of 3 bytes, you will almost always be doing single-byte accesses anyway.
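To see the structure padding mentioned above, here is a small sketch. The struct `sample` is made up for illustration. The exact offsets are implementation-defined, so the code checks only the property that holds everywhere: the 4-byte member lands on a boundary suitable for its type.

```c
#include <stddef.h>
#include <stdint.h>

struct sample {
    uint8_t  tag;    /* 1 byte */
    uint32_t value;  /* the compiler inserts padding before this member */
};

/* Checked at compile time (C11): `value` starts on a boundary that
 * satisfies the alignment requirement of uint32_t. */
_Static_assert(offsetof(struct sample, value) % _Alignof(uint32_t) == 0,
               "value must be suitably aligned");
```

On typical 32-bit and 64-bit targets, `offsetof(struct sample, value)` is 4 and `sizeof(struct sample)` is 8, but those numbers are not guaranteed by the language.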
If you decide instead that you want to access each pixel in a single 4-byte access (as opposed to three 1-byte accesses), it would be better to use 32-bit pixels so that each individual pixel is aligned on a 4-byte boundary. Aligning each row to a 4-byte boundary, but not each pixel, will have little effect, if any.
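A rough sketch of the 32-bit-pixel layout: the image is stored as an array of `uint32_t`, so every pixel is 4-byte aligned as long as the array itself is. The channel order chosen here (red in the low byte, top byte unused) and the names `pack_rgbx` and `pixel_index` are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack three 8-bit channels into one 32-bit pixel; the top byte is unused. */
static uint32_t pack_rgbx(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint32_t)r | ((uint32_t)g << 8) | ((uint32_t)b << 16);
}

/* Index of pixel (x, y) in a uint32_t array holding `width` pixels per row.
 * No row padding is needed: every row is already a multiple of 4 bytes. */
static size_t pixel_index(size_t width, size_t x, size_t y)
{
    return width * y + x;
}
```

With this layout, a pixel write is a single store, for example `image[pixel_index(width, x, y)] = pack_rgbx(r, g, b);`, at the cost of one wasted byte per pixel.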
Based on your code, I am guessing it is related to reading the Windows bitmap file format: bitmap files require the length of each scanline to be a multiple of 4 bytes, so setting up your pixel data buffer with that property lets you read the entire bitmap into your buffer in one go. (Of course, you still have to deal with the fact that the scanlines are stored bottom-to-top instead of top-to-bottom, and that the pixel data is BGR instead of RGB.) This is not really much of an advantage, though; it is not that much harder to read the bitmap one scanline at a time.
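To illustrate the two fix-ups mentioned above, here is a minimal sketch that converts pixel data that is already in memory. It assumes an uncompressed 24-bit bitmap stored bottom-to-top with rows padded to 4 bytes; header parsing is left out, and `bmp_rows_to_rgb` is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert bottom-up, 4-byte-padded BGR rows (`src`) into tightly packed,
 * top-down RGB rows (`dst`). `dst` must hold width * height * 3 bytes. */
static void bmp_rows_to_rgb(const uint8_t *src, uint8_t *dst,
                            size_t width, size_t height)
{
    size_t src_stride = (width * 3 + 3) & ~(size_t)3; /* padded row length */

    for (size_t y = 0; y < height; y++) {
        /* The first row stored in the bitmap is the bottom row of the image. */
        const uint8_t *in = src + src_stride * (height - 1 - y);
        uint8_t *out = dst + width * 3 * y;

        for (size_t x = 0; x < width; x++) {
            out[3 * x + 0] = in[3 * x + 2]; /* R */
            out[3 * x + 1] = in[3 * x + 1]; /* G */
            out[3 * x + 2] = in[3 * x + 0]; /* B */
        }
    }
}
```

The same loop works when reading one scanline at a time from a file: read `src_stride` bytes per row and convert only the first `width * 3` of them.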