Chunky-to-Planar for Dummies: An in-depth tutorial by The Paranoid / Paradox

The following file can be found on https://alive.atari.org/alive8/c2p.php.

:
:
:
:   Tutorial:
:   
:   
:    Chunky-to-Planar for Dummies
:
:
:    An in-depth tutorial by The Paranoid
:
:


Chapter One:  The Prologue
--------------------------
You know what Alive lacked the past few issues ? Some tutorial.
Maggie often featured it, UCM did and Alive did as well.
However, as fewer and fewer contributors volunteer to write for
diskmagazines, and as the makers of demos have less and less
time and are usually pushed by diskmag-editors to release more
and more demos of higher and higher quality in less and less
time, they usually don't feel very motivated to share their
wisdom in diskmags.
I am probably not a very skilled coder and have not been coding
for very long either, but the chunky-to-planar technique, c2p
for short, is on the one hand so simple and on the other so
essential for coding new-school effects on the Atari ST that i
thought it might be time for another tutorial, even though Ray
and Ultra have already covered how it works and how to optimize it.
Many thanks to them for that, i learned quite a lot from you and
you should be credited here in any case for making this tutorial
possible at all.


Chapter Two: Why Bitplanes ?
-----------------------------
You might find it interesting to see how computer graphics
evolved and why the Atari ST uses such an indirect graphics
format as interleaved bitplanes.
In the beginning, the generation of computer graphics was very
closely related to the hardware that actually generated the
video signal and the result was called block graphics. The
cathode beam of the TV started to draw a line from the left
to the right and at the very start of a line, it triggered a
clock for a hardware that was capable of changing the video
signal's output colour at every tick of the clock. Now, the
building of one line on screen by the TV costs a certain
time. And if the video hardware of the computer system is,
let's say, 10 times as fast as the TV, it can change the
video signal's colour 10 times in the time that the TV takes
to draw a line - meaning, it could display 10 blocks a line:


    ---------------------------------------> t
TV: ---------------------------------------> building a line
Com:---|---|---|---|---|---|---|---|---|---|

       | indicates a colour change


Now, to crunch a few numbers, let's take PAL for example. This
video format sports a horizontal frequency of roughly 15KHz. If
your computer's video clock is now, for example, 600KHz, it would
be 40 times the horizontal frequency, allowing 40 colour
changes per line and thus allowing a horizontal resolution of
40 pixels per line - which is exactly what the playfield
generator of the Atari 2600 is capable of.
The information which colour which pixel has is usually
directly stored in the video generator. Some later systems
had this information in the main memory, the RAM. In this
case, the video processor would directly read the main
memory to fetch each pixel.

However, physical resolutions of 40 or 80 pixels a line are
probably not really suitable for word processing or
spreadsheets. The solution is character block graphics.
Character block graphics have been very popular in 8-bit
home computers and video games. The graphic processor does
not read pixel information directly but rather the number
of a specific character block which is supposed to be
displayed this line, then looks up the specific pixel
information of this character block in the character
list and displays this character:

       --------------------------------------> t
Video  | Reads character block number 0
       | Looks up Character
       | Displays first line of character
        | Reads character block number 1
        | Looks up Character
        | Displays first line of character
         | ...

The main advantage of this graphic system is the low usage
of RAM, because in the early days of computer generated
graphics, RAM was very expensive while ROM was fairly cheap.
Character block graphics make it possible to keep only the
information which character appears where in the RAM, while the
character information itself can be stored in ROM. The video memory
then only points to the character which should be displayed
there and the video memory can be fairly small then: At a
resolution of 320 x 200 pixels in total, if each pixel would
be one byte in size, the video memory would be 64.000 bytes
and therefore exceed the memory of most 8-bit machines. Using
character block graphics however, with each character 8 x 8
pixels large, the system would display 40 x 25 characters in
total, needing a mere 1000 bytes in the RAM.
However, character block graphics generators still work at
rather high clockrates. To generate 320 pixels a line, the
video generator must run at a clockspeed 320 times the speed
of the TV horizontal rate. For the PAL system, this would be
4800KHz or 4.8MHz! And as most video generators of this time
also produce a so-called border, a blank space between the
real screen content and the boundary of the physical cathode
ray tube, the clockrates were usually even higher.
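
The numbers in this chapter are easy to re-check. Here is a small
sketch in Python (used only as a pocket calculator here, the function
names are made up for illustration, and the PAL line frequency is
rounded to 15KHz as in the text):

```python
def pixels_per_line(pixel_clock_hz, line_freq_hz):
    """How many colour changes fit into one scanline."""
    return pixel_clock_hz // line_freq_hz

def pixel_clock_for(pixels, line_freq_hz):
    """Clock needed to output `pixels` per scanline (border ignored)."""
    return pixels * line_freq_hz

LINE_FREQ = 15_000  # rounded PAL horizontal frequency, as used in the text

print(pixels_per_line(600_000, LINE_FREQ))   # 40 blocks per line
print(pixel_clock_for(320, LINE_FREQ))       # 4800000 Hz = 4.8 MHz

# Video memory: one byte per pixel versus one byte per 8x8 character
print(320 * 200)                 # 64000 bytes as a bitmap
print((320 // 8) * (200 // 8))   # 1000 bytes as character numbers
```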

However, character block graphics have one major disadvantage:
Pixels can never be accessed individually on screen. Even
if the character set is used from RAM where it could be
modified, a character set is limited in size, usually to 256
characters, and that is not enough to cover the whole
screen with each character displayed only once (40 x 25
characters would need 1000 different ones).
Also, colour information is usually reduced. Usually, not
every pixel of a character can have an individual colour;
there are limits, like one colour per character, or one
foreground colour per character plus 3 background colours
shared by all characters.
To freely access any pixel on screen, bitmap graphics
were introduced.

Bitmap graphics grew popular as the computers gained main
memory and did not need the detour via character blocks
anymore. In bitmap graphics, each pixel's colour information
is directly stored in RAM. But the word bitmap already suggests
that this colour information does not need to be stored
in one piece but can be spread over single bits,
which brings us to planar graphics.
In planar graphics, each pixel's colour information is stored
in different planes. Meaning, the graphics memory does not
exist once but in as many planes as there are bits needed
per pixel. If you have, for example, a computer capable of
displaying 32 colours at once, it would require 5 bits per
pixel (5 bits can encode number 0 to 31). Then, graphic
information would be stored as 5 planes of 1 bit depth each.
The graphic processor would read all 5 planes at the same
time, but read these bit by bit and assemble the colour
information for each pixel:

           Bpl 0     Bpl 1    Bpl 2    Bpl 3    Bpl 4
Memory:   0 1 1 0   1 0 1 1  0 1 1 0  1 1 1 1  0 1 0 1
          | | | |___|_|_|_|__|_|_|_|__|_|_|_|__|_|_|_|___ Pixel 3
          | | |_____|_|_|____|_|_|____|_|_|____|_|_|_____ Pixel 2
          | |_______|_|______|_|______|_|______|_|_______ Pixel 1
          |_________|________|________|________|_________ Pixel 0

On screen, the first 4 pixels would then get the colour (bits
written in the order Bpl 0 to Bpl 4):

             Pixel 0    Pixel 1   Pixel 2   Pixel 3
              01010      10111     11110     01011

Resulting colour 10        23        30        11
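
To make the assembling step concrete, here is a small sketch in
Python (not ST code, just a model, and the names are made up for this
example). It follows the way the colour values are written above: the
left-most bit of each plane belongs to the first pixel and the bit from
Bpl 0 is written first. Note that this bit order is only the one used
in this illustration; on the ST itself, bitplane 0 holds the lowest bit
of the colour number.

```python
# One list per bitplane, leftmost pixel first (values from the diagram)
planes = [
    [0, 1, 1, 0],  # Bpl 0
    [1, 0, 1, 1],  # Bpl 1
    [0, 1, 1, 0],  # Bpl 2
    [1, 1, 1, 1],  # Bpl 3
    [0, 1, 0, 1],  # Bpl 4
]

def pixel_colour(planes, x):
    """Collect bit x of every plane; the bit of Bpl 0 is written first."""
    colour = 0
    for plane in planes:
        colour = (colour << 1) | plane[x]
    return colour

print([pixel_colour(planes, x) for x in range(4)])  # [10, 23, 30, 11]
```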

To underline the way this format works it should be noted
that the graphic memory is organized in several screen
memories of single-bit information, meaning: the most
significant bit of a byte represents the left-most pixel,
let's call it the first, the next bit represents the second
pixel and so forth. Only the existence of multiple planes makes
this format really capable of displaying multi-colour screens
instead of "monochrome" (2 coloured) ones.
Our beloved Atari ST bears another special feature. The bitplanes
do not need to be completely separated in memory, with each
plane forming one compact block. Planes could be organized, for
example, in single lines, as the Amiga graphic chip can do:
In memory, the first line of the first bitplane is followed by
the first line of the second bitplane, is followed by the first
line of the third bitplane and so forth until the second line of
the first bitplane comes, which is again followed by the second
line of the second bitplane etc.
The Atari ST however has interleaved bitplanes: The first word
of graphics memory describes the first 16 pixels on screen in
the first bitplane, the second word describes the same 16 pixels
in the second bitplane and so forth. If the Atari ST displays 16
colours on screen, meaning 4 bitplanes, you have 8 bytes (4 words)
of data which describe 16 pixels in all 4 bitplanes.
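
As a sketch of what this layout means for addressing, the following
Python model (not ST code; the function name is made up, and it assumes
the low resolution of 320x200 in 4 bitplanes, which gives 160 bytes per
line) computes where the bit of one pixel in one bitplane lives:

```python
def st_lowres_location(x, y, plane):
    """Return (byte offset, bit number) of pixel (x, y) in one bitplane.

    Assumes ST low resolution: 320x200, 4 bitplanes, 160 bytes per line.
    """
    BYTES_PER_LINE = 160          # 20 groups of 4 words
    group = x // 16               # which block of 16 pixels
    offset = y * BYTES_PER_LINE + group * 8 + plane * 2
    bit = 15 - (x % 16)           # leftmost pixel is the top bit
    return offset, bit

print(st_lowres_location(0, 0, 0))    # (0, 15)
print(st_lowres_location(17, 0, 3))   # (14, 14)
print(st_lowres_location(0, 1, 0))    # (160, 15)
```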

The reason for bitplane graphics is that at the time these
computers were made, it was the most sensible format. The graphic
chips were not yet fast enough to combine 8 or even 16 bits of
graphic data per pixel, yet 3 or 5 bits per pixel are hard to
arrange in memory. The interleaved bitplane format of the Atari
ST is fairly easy to handle in hardware: The video shifter, the
graphics hardware of the Atari ST, reads 4 words from memory
per access, then each bit serves one switch to access the
correct palette register:
                              0----[ Palette Register 0]
                        +----[3]
                        |0    1----[ Palette Register 1]
                  +----[2]
                  |     |1    0----[ Palette Register 2]
                  |     +----[3]
                  |0          1----[ Palette Register 3]
            +----[1]
            |     |1          0----[ Palette Register 4]
            |     |     +----[3]
            |     |     |0    1----[ Palette Register 5]
            |     +----[2]
            |           |1    0----[ Palette Register 6]
            |0          +----[3]
4 Bit       |                 1----[ Palette Register 7]
pixel -----[0]
info.       |                 0----[ Palette Register 8]
            |1          +----[3]
            |           |0    1----[ Palette Register 9]
            |     +----[2]
            |     |     |1    0----[ Palette Register 10]
            |     |     +----[3]
            |     |0          1----[ Palette Register 11]
            +----[1]
                  |1          0----[ Palette Register 12]
                  |     +----[3]
                  |     |0    1----[ Palette Register 13]
                  +----[2]
                        |1    0----[ Palette Register 14]
                        +----[3]
                              1----[ Palette Register 15]

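The number in brackets represents the bit which is being
checked to decide which branch to take, counted here from the
most significant bit (0) to the least significant bit (3) of
the 4-bit pixel value. On the ST, the most significant bit
comes from the last bitplane and the least significant one
from bitplane 0. The 0 or 1 next to a branch denotes the path
to be taken if the bit has that value. The content of
the palette register then determines the video signal
output to display the corresponding colour on screen.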

However, again, bitplane graphics have a major disadvantage:
Accessing a single pixel is slow. If you want to change
the colour of one single pixel, you have to modify 4 words
in memory, and if you need to do it flexibly, you have
to mask out the bit in each word (AND) and to write the new
value for this bit (OR), resulting in 8 operations in total
to change a mere pixel.
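
The AND/OR dance looks roughly like this. It is a Python model, not
ST code; the function name is made up, and it assumes one group of 4
bitplane words with bitplane 0 holding the lowest bit of the colour:

```python
def set_pixel_planar(words, x, colour):
    """Set pixel x (0..15) in one group of 4 bitplane words to `colour`."""
    bit = 15 - x                      # leftmost pixel lives in bit 15
    mask = 1 << bit
    for plane in range(4):
        words[plane] &= ~mask & 0xFFFF        # AND: clear the old bit
        if (colour >> plane) & 1:
            words[plane] |= mask              # OR: write the new bit
    return words

group = [0xFFFF, 0x0000, 0xFFFF, 0x0000]      # 16 pixels of colour 5
set_pixel_planar(group, 0, 10)                # leftmost pixel -> colour 10
print([f"{w:04X}" for w in group])            # ['7FFF', '8000', '7FFF', '8000']
```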



Chapter Three: And what is chunky ?
-------------------------------------

The internals of the graphic processor as displayed in the
diagram above are suitable for 4 or 5, maybe even 6 bitplanes,
resulting in 16, 32 or 64 colours, but for 8 or even
16 bits per pixel, they are not very suitable anymore. Also,
changing pixel information with 16 bit colourdepth in plane
mode would require to read, modify and write 16 words. Needing
two operations per word, it would require 32 operations, just
to change the colour of one single pixel.
This is where graphic system developers decided to do a serious
"cut" and re-engineer the whole way of graphic generation. The
graphics generator would now read one or two, maybe 4 pixels
at a time and then display these, using a technique intrinsic
to the colour model used. To quickly read one or two
pixels from memory, it would also be very unwise to have the
graphics chip read at different locations and assemble
single-bit information. Therefore, the whole pixel information is
stored as densely as possible - meaning, for 16 bits per pixel,
in one 16-bit word. And this is nicknamed "chunky" mode.
If we now suppose this 16-bit information can be separated
into 5 bit information for the red-percentage of the pixel,
6 bit green and 5 bit blue (RGB format 565), the graphic
processor would (probably) work the following way:

                                 +---------+
                                 |         |
                           5-Bit | Decoder |       Analogue
              +-----------[RRRRR]|  logic  |------ Output
              |             Red  |         |       Pin Red
              |                  +---------+
              |                  +---------+
              |                  |         |
16 Bit        |            6-Bit | Decoder |       Analogue
pixel -----[SPLIT]-------[GGGGGG]|  logic  |------ Output
info.         |            Green |         |       Pin Green
              |                  +---------+
              |                  +---------+
              |                  |         |
              |            5-Bit | Decoder |       Analogue
              +-----------[BBBBB]|  logic  |------ Output
                            Blue |         |       Pin Blue
                                 +---------+

The 16-bit pixel information is separated into a red, a green
and a blue block. Each is interpreted separately by a decoder
logic, which can either be a simple D/A-converter or some fixed
hardware which has a fixed output level for each of the possible
red, green or blue values. The signal is then put out on
three separate channels.
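
In software, the "SPLIT" box is nothing but shifting and masking. A
sketch in Python (the function name is made up, and it assumes that
red sits in the top 5 bits, green in the middle 6 and blue in the
lowest 5 bits of the word):

```python
def split_rgb565(pixel):
    """Split a 16-bit RRRRRGGGGGGBBBBB word into its three fields."""
    red = (pixel >> 11) & 0x1F
    green = (pixel >> 5) & 0x3F
    blue = pixel & 0x1F
    return red, green, blue

print(split_rgb565(0xFFFF))   # (31, 63, 31)
print(split_rgb565(0xF800))   # (31, 0, 0)
print(split_rgb565(0x07E0))   # (0, 63, 0)
```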

The main advantage of this video format is that single pixel
information is easily accessible. To change a colour of a
single pixel requires just one memory access. No read-modify-
write of a whole set of pixels is necessary anymore. The
disadvantage is obvious as well. The video controller has to
read every single pixel just as well and if the video memory
is located in the main memory, this will interrupt the CPU
quite often.
An example. At a horizontal resolution of 640 pixels a line
and 16-bit colour depth, a line would consist of 1280 Bytes.
One line on a VGA monitor is updated at 35KHz refresh. If
there were no blank times, this would mean a data rate of
about 42MB per second that the graphic controller would need
to shuffle. Due to the fact that the screen contains blanks,
naturally, there are times in which the graphic controller
does not access the memory at all, but these 42MB per second
are the peak rate that the main memory must perform just to
feed the graphic controller during a line build up.
This, by the way, exceeds the memory bandwidth of the Falcon
by far. The maximum the Falcon can do is 768 pixels a line
in 16-bit colour depth on RGB, using a horizontal refresh
of 15KHz, which results in a peak rate of below 24MB per
second.
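
The two data rates can be re-calculated with a few lines of Python
(just a calculator sketch; the function name is made up, blank times
are ignored as in the text, and 1MB is taken as 1024 x 1024 bytes):

```python
def line_data_rate(pixels_per_line, bytes_per_pixel, line_freq_hz):
    """Bytes per second needed if lines followed each other without blanks."""
    return pixels_per_line * bytes_per_pixel * line_freq_hz

MB = 1024 * 1024

vga = line_data_rate(640, 2, 35_000)
falcon = line_data_rate(768, 2, 15_000)
print(vga / MB)      # about 42.7 MB per second
print(falcon / MB)   # about 22 MB per second, so below 24MB
```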


Chapter Four: Why do chunky mode on the ST ?
----------------------------------------------

Now the question arises naturally: Why simulate a graphic
mode on the Atari ST in which every pixel is stored compactly,
for example one byte per pixel ? And the answer is: To access
each pixel directly.
If you try to perform new-school effects on the Atari ST, you
will very soon see that this is essential. Most old-school
demos either copied and moved blocks of graphics around
(sprites, wobblers, scrollers etc) or calculated graphics in
a way that they were relatively easy to handle in the natural
plane mode (line vector graphics, dots, filled vector graphics).
New school effects however usually rely heavily on single pixel
manipulation. Let's discuss a simple zoomer as an example.

A Zoomer usually works this way: The computer reads a source
picture and copies the pixels of that into a target picture
which could be the screen. Now after copying one pixel, it
increments the source pointer by a certain offset. If this
offset turns out to be one, an identical copy will result.
If this offset is for example 2, the copy will lack half of
the pixels of the source, therefore be reduced by a factor
of 2. If the offset is a real number and for example 0.5,
each pixel of the source would be read twice and the copy
would be twice as large.
Without going into detail about how a real number can be
used as an offset, you automatically see that we badly need
to access each pixel individually - We cannot handle the
source picture blockwise and copy 4, 8 or 16 pixels at
once for a zoom effect. In other words, we need the source
picture in chunky mode and the planar mode of the Atari ST
doesn't help us at all.
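
A minimal sketch of such a zoomer for one line, written in Python
as a model rather than ST code. The function name is made up, and the
use of 16.16 fixed point numbers for the "real number" offset is an
assumption of this example, one common way of doing it and not
necessarily how any particular demo does it:

```python
def zoom_line(source, target_len, step):
    """Copy pixels from `source`, advancing by `step` source pixels each time.

    `step` is kept as 16.16 fixed point so fractional offsets work
    with integer arithmetic only.
    """
    step_fp = int(step * 65536)
    pos_fp = 0
    target = []
    for _ in range(target_len):
        target.append(source[pos_fp >> 16])   # integer part = source index
        pos_fp += step_fp
    return target

line = [3, 12, 13, 10]
print(zoom_line(line, 4, 1))     # [3, 12, 13, 10]  identical copy
print(zoom_line(line, 2, 2))     # [3, 13]          half size
print(zoom_line(line, 8, 0.5))   # [3, 3, 12, 12, 13, 13, 10, 10]
```

Every target pixel reads exactly one source pixel at an arbitrary
position, which is trivial with one byte per pixel and painful with
bitplanes.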


Chapter Five: Chunky to Planar
--------------------------------

There have been several ways how to convert the chunky
pixel information as required by the effect to planar pixel
information as handled by the Atari ST graphic controller,
the Shifter. Let's suppose we have already written the
zoom-effect by itself and it will generate a picture in
chunky format, with 1 byte per pixel. Because the Atari
ST's Shifter can only handle 16 colours directly, we
suppose that only the low 4 bits of each byte is used,
but that is not really relevant at the moment.
Let's suppose, we have 4 pixels at the very start of this
picture of the following type:

             Pixel:    0    1    2    3
    Chunky Picture:   03   0C   0D   0A in Bytes

The very left pixel has the colour 3, the following the
colour 12, the next colour 13 and the last one colour 10.
Now, to display these 4 pixels in the correct colours on
the Atari ST, the result in memory would have to be

      Planar Picture:
          Bitplane 0: 1010 0000 0000 0000
          Bitplane 1: 1001 0000 0000 0000
          Bitplane 2: 0110 0000 0000 0000
          Bitplane 3: 0111 0000 0000 0000 in single Bits
        Pixel colour: 3CDA

If you study the above example carefully, you understand the
basic transformation that is required: We need to spread the
bits of one pixel, that are stored compact in the chunky mode,
over the bitplanes and on the Atari ST, this means over 4
words.
This could be hardcoded, for example the following way:
We could read the chunky pixel, then set each bit assigned
to this pixel in the 4 words, representing the 4 bitplanes,
individually:

 - Read pixel 0 from chunky picture
 - Set or clear Bit 15 in word 0
 - Set or clear Bit 15 in word 1
 - Set or clear Bit 15 in word 2
 - Set or clear Bit 15 in word 3
 - Read pixel 1 from chunky picture
 - Set or clear Bit 14 in word 0
 - Set or clear Bit 14 in word 1
 - Set or clear Bit 14 in word 2
 - Set or clear Bit 14 in word 3
 - ...

However, this would involve "touching" each bitplane's word
16 times to set every pixel, not to mention the decision
making whether a bit needs to be set or not (depending on
the pixel's colour) and would therefore be very very slow.
If we reduce the virtual resolution to 160 pixels a
line, we could set 2 planar pixels at once, reducing the
required CPU-time to 1/2, but still, it would be far too
slow for regular usage.
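
To see how much work this naive scheme does, here is a model of it
in Python (not ST code, the function name is made up). It assumes
chunky pixels with values 0 to 15, the left-most pixel in bit 15 and
bitplane 0 holding the lowest bit of the colour:

```python
def c2p_naive(chunky):
    """Convert up to 16 chunky pixels (values 0..15) to 4 bitplane words."""
    words = [0, 0, 0, 0]
    for x, colour in enumerate(chunky):
        bit = 15 - x                       # pixel 0 is the leftmost bit
        for plane in range(4):
            if (colour >> plane) & 1:      # decide: set this bit or not
                words[plane] |= 1 << bit
    return words

planes = c2p_naive([0x3, 0xC, 0xD, 0xA])
for number, word in enumerate(planes):
    print(f"Bitplane {number}: {word:016b}")
```

The inner loop runs 4 times per pixel, each time with a decision and
possibly a write, and that is exactly the part we want to get rid of.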
Assembly programmers tend to use tables to speed things
up, in other words, precalculate all the data that might
be needed and then "simply" collect the part of the data
needed at runtime.
What would a table look like for this problem ?
We read the pixel's colour from the chunky picture compact
in 1 byte. This byte could then be used to find the assigned
value(s) in plane format. However, as the pixel information
is spread over 4 words in plane format, we would need to read
8 bytes each time we convert one chunky pixel to one plane
pixel. Not to mention the fact that the pixel position is also
a parameter we need to consider: The very left pixel (Pixel 0)
has position 15 in each of the 4 bitplane words, the next pixel
(Pixel 1) has position 14 in each of the 4 bitplane words and
so on until the right-most pixel (Pixel 15), which occupies
Bit 0 in each of the 4 bitplane words. In other words, we would
need 16 tables, one per pixel position, with entries of 8 bytes
each to convert each chunky pixel of a set of 16 chunky pixels
to the correct bitplane pixels.
Now, if we again reduce the virtual resolution to half the
physical one, we can write 2 pixels at once in plane mode,
which gives us a boost, but still, we need to read 8 bytes
every time we convert one chunky pixel to a planar pixel.

This is where a special instruction of the 68000 comes to
help us an awful lot: The movep-instruction.
MoveP stands for "move peripheral data" and was introduced
to make it easier to feed 8-bit peripheral chips whose
registers sit on every second byte address, but indeed it looks
like it was just meant for chunky to planar conversion.
The movep instruction takes a whole longword, meaning 32
bits at once, from a data register and spreads the 4 bytes
contained in these 32 bits over 4 consecutive words in memory.
It accepts an address register plus a direct offset, and if
the resulting address is even, it will put the 4 bytes into
the high-bytes of the 4 words, and if it is odd, it
will put the 4 bytes into the low-bytes of the 4 words.
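
What movep.l does when writing to memory can be modelled in a few
lines of Python (a sketch only; the function name and the bytearray
standing in for ST memory are made up for this example):

```python
def movep_l(memory, address, value):
    """Model of movep.l Dn,d(An): write 4 bytes to every second address."""
    for i in range(4):
        shift = 24 - 8 * i                       # highest byte goes first
        memory[address + 2 * i] = (value >> shift) & 0xFF

screen = bytearray(8)                 # 4 bitplane words, all zero
movep_l(screen, 0, 0x11223344)        # even address: high bytes
movep_l(screen, 1, 0xAABBCCDD)        # odd address: low bytes
print(screen.hex())                   # 11aa22bb33cc44dd
```

Two movep.l writes, one on an even and one on an odd address, fill
all 8 bytes of one group of 4 bitplane words.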

How does that fit into the scheme described above ?
It will actually do exactly what costs us most of the CPU-time
in the scheme described above, it will shorten the tables we
need and spread the information over the words as required.

What we now need is a table with entries of 4 bytes each. Each
bitplane is represented by 1 byte of these 4 and therefore
contains 4 pixels, if we stick to the reduced horizontal resolution.
The table will contain, for a single pixel position, all possible
colours, so that, if we use the colour of our looked up
chunky pixel as index, we read the left byte of each bitplane
this pixel appears in. Let's look at an example of how this
table looks for the very left pixel (pixel 0):

  dc.b %00000000,%00000000,%00000000,%00000000 ;Colour 0
  dc.b %11000000,%00000000,%00000000,%00000000 ;Colour 1
  dc.b %00000000,%11000000,%00000000,%00000000 ;Colour 2
  dc.b %11000000,%11000000,%00000000,%00000000 ;Colour 3
  dc.b %00000000,%00000000,%11000000,%00000000 ;Colour 4 and so on

Now, for the second pixel (pixel 1), the table would look
identical, just that the second bitplane (double-)pixel is set
while all other bits are zero:

  dc.b %00000000,%00000000,%00000000,%00000000 ;Colour 0
  dc.b %00110000,%00000000,%00000000,%00000000 ;Colour 1
  dc.b %00000000,%00110000,%00000000,%00000000 ;Colour 2
  dc.b %00110000,%00110000,%00000000,%00000000 ;Colour 3 and so on

The tables for pixels 2 and 3 are similar. And as one byte
can only contain 4 pixels in the resolution we decided upon,
we only need these 4 tables of 16 entries each.

What's the big deal, you say ? The big deal is that we can read
all information needed to convert one chunky pixel to one bitplane
(double-)pixel in one longword access! Instead of reading 4 words
for all 4 bitplanes, we only need to read one 32-Bit word.
So, what we do is we read the colour value of our first chunky
pixel and look up the correct line in the table and read the
whole longword (which will then contain the left byte of all 4
bitplane-words for this first pixel). Then we read the second
chunky pixel and look up the correct line in the second table
and OR-combine it with the first one.
We repeat this procedure for the third and fourth pixel, and then
we have all 4 chunky pixels we read stored in all 4 bitplanes
correctly in one longword in a register of our choice.
And then, we use movep as it will spread the 4 bytes in the
register over 4 bitplane words in the target area:

  move.l  #%11001111110011000011000000110011,d0
  movep.l d0,0(a0)

 will result in

   %11001111 ........
   %11001100 ........
   %00110000 ........
   %00110011 ........

 at the address stored in A0 (the low bytes, shown as dots,
 are left untouched) while

  movep.l d0,1(a0)

 will result in

   %........ 11001111
   %........ 11001100
   %........ 00110000
   %........ 00110011

 at the address stored in A0 (here the high bytes are left
 untouched). Note that movep only takes a data register as its
 source, not an immediate value.

This sets the procedure we need to follow to successfully
convert all pixels of our source chunky picture to planar:
We first convert 4 pixels using the procedure described above,
put them to screen using movep on an even address, then
convert the next 4 pixels using the procedure described above
and put them to screen using movep on an odd address:

      lea c2p_pixel0,a0      ;c2p table for pixel 0 to a0
      lea c2p_pixel1,a1      ;c2p table for pixel 1 to a1
      lea c2p_pixel2,a2      ;c2p table for pixel 2 to a2
      lea c2p_pixel3,a3      ;c2p table for pixel 3 to a3
      lea source_pic,a4      ;the source picture
      lea target,a5          ;the target

      moveq #0,d0            ;clear work register

      move.w #no_lines-1,d6  ;number of lines (dbra counts down to -1)
.outloop:
      move.w #(no_pixels/8)-1,d7 ;one loop pass handles 8 chunky pixels
.inloop:
      ;The table entries are 4 bytes each, so the colour is multiplied
      ;by 4. Storing the chunky pixels as colour*4 saves the lsl.
      move.b (a4)+,d0        ;fetch chunky pixel 0
      lsl.w  #2,d0           ;colour * 4 = offset into table
      move.l 0(a0,d0.w),d5   ;convert to planar
      move.b (a4)+,d0        ;fetch chunky pixel 1
      lsl.w  #2,d0           ;colour * 4
      or.l   0(a1,d0.w),d5   ;convert to planar, combine with above
      move.b (a4)+,d0        ;fetch chunky pixel 2
      lsl.w  #2,d0           ;colour * 4
      or.l   0(a2,d0.w),d5   ;convert and combine
      move.b (a4)+,d0        ;fetch chunky pixel 3
      lsl.w  #2,d0           ;colour * 4
      or.l   0(a3,d0.w),d5   ;convert and combine
      movep.l d5,0(a5)       ;put to screen, high bytes

      move.b (a4)+,d0        ;fetch chunky pixel 4
      lsl.w  #2,d0           ;colour * 4
      move.l 0(a0,d0.w),d5   ;convert to planar
      move.b (a4)+,d0        ;fetch chunky pixel 5
      lsl.w  #2,d0           ;colour * 4
      or.l   0(a1,d0.w),d5   ;convert to planar, combine with above
      move.b (a4)+,d0        ;fetch chunky pixel 6
      lsl.w  #2,d0           ;colour * 4
      or.l   0(a2,d0.w),d5   ;convert and combine
      move.b (a4)+,d0        ;fetch chunky pixel 7
      lsl.w  #2,d0           ;colour * 4
      or.l   0(a3,d0.w),d5   ;convert and combine
      movep.l d5,1(a5)       ;put to screen, low bytes

      addq.l #8,a5           ;increase target pointer
      dbra d7,.inloop        ;loop through line
      ...                    ;add offsets to source and target
      dbra d6,.outloop       ;loop over lines


And voila ... You successfully converted a picture in the chunky
format originally unknown to the Atari ST to the Atari ST's own
bitplane format.
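
If you want to check your tables and your loop before fighting with
the assembler, the whole scheme can be modelled in Python. This is
only a sketch with made-up names that mirrors the routine above (4
tables, OR-combining 4 pixels, one movep on an even and one on an odd
address); it says nothing about speed, of course:

```python
def make_tables():
    """Four tables (one per double-pixel position) of 16 longwords each."""
    tables = []
    for position in range(4):
        pattern = 0b11000000 >> (2 * position)
        table = []
        for colour in range(16):
            entry = 0
            for plane in range(4):
                entry = (entry << 8) | (pattern if (colour >> plane) & 1 else 0)
            table.append(entry)
        tables.append(table)
    return tables

def movep_l(memory, address, value):
    """Model of movep.l: 4 bytes go to address, +2, +4 and +6."""
    for i in range(4):
        memory[address + 2 * i] = (value >> (24 - 8 * i)) & 0xFF

def c2p_line(chunky, tables):
    """Convert chunky pixels (a multiple of 8) into interleaved plane data."""
    screen = bytearray(len(chunky))          # 8 chunky pixels -> 8 bytes
    target = 0
    for start in range(0, len(chunky), 8):
        for half in range(2):                # even address, then odd address
            longword = 0
            for position in range(4):
                colour = chunky[start + 4 * half + position]
                longword |= tables[position][colour]
            movep_l(screen, target + half, longword)
        target += 8
    return screen

line = c2p_line([3, 12, 13, 10, 0, 0, 0, 15], make_tables())
print(line.hex())                            # cc03c3033c033f03
```

The first four chunky pixels are the 3, C, D, A from the example in
this chapter, each doubled in width, so the high bytes of the four
bitplane words come out as CC, C3, 3C and 3F.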


Chapter Six:  Limitations
---------------------------

Even though the method described above is without a doubt the
fastest there is, it still costs the Atari ST an awful lot of CPU
time. Now,
most effects implemented using the chunky graphic format are
kind of slowish already and then you additionally have to
invest some CPU time into converting it so that the Shifter
can read it.
In theory, to save some more CPU time, you might decide to
reduce the resolution even further to probably 80x50. This
is possible and can use the same scheme as described above,
however, you will most probably be disappointed about the
speed boost. C2p in such a low resolution can be done
without using the movep-scheme, being slightly quicker than
using movep, but the benefit in speed is only valuable if
your effect itself runs faster in this resolution, e.g. by
reducing the number of pixels to be calculated.
A typical trick to save some CPU time in old-school effects
is to reduce the number of bitplanes involved, however, that
also doesn't really work for the c2p-scheme described above.
Reducing from 4 to 3 bitplanes will not speed up anything at
all and if you go to 2 bitplanes, a gain in speed will only
be achieved when you skip the movep and do it the classic
way (by reading 1 longword per converted pixel and writing
this after all pixels converted). Still, this will not be
twice as fast as doing 4 bitplanes using movep.

The biggest benefit of this scheme however is that it is so
easy to integrate into the effect that you are coding. If
your effect allows one chance of getting rid of the chunky
buffer, do it. Integrate the conversion into your effect,
convert each pixel your effect produces directly and your
routine will speed up a lot, simply because there is no
buffer to be written (step 1) and be interpreted again
(step 2). And basically ALL effects that allow to produce
the pixels horizontally line by line can integrate the c2p
conversion directly. It only becomes difficult when the
pixels produced appear at "random" coordinates, like when
drawing lines at a random angle.


Chapter Seven: Atari owners win again...
------------------------------------------

Remember that slogan from the early 80s ? In fact, Atari ST
users DO win again because the interleaved bitplane format
is, of all bitplane formats, the best one for converting
chunky to plane.

Especially the Amiga users that smiled so much about Atari's
interleaved bitplane format in the early days have to suffer.
The main competitor of the Atari ST organizes its bitplanes
in whole lines at best. If you have a resolution using 4 bitplanes
and a line is 320 pixels long, bitplane 0 line 0 goes from
byte 0 to byte 39, bitplane 1 line 0 goes from byte 40 to
79 and so on. movep ? Forget it.
Well, Amiga programmers found their ways of doing c2p their
way, but honestly, if you see what an Atari 1040 STF can do
and compare that to the rare demos of today that still run
on an untuned Amiga 500, you'll see the difference.

And another group of computers benefits from their "odd"
hardware that allows chunky-to-planar conversion rather
easily, namely all those using character block graphics.
At a character resolution of 40 blocks a line and a
limited amount of colours, let's say 4, it's very easy
to do chunky-to-planar: Each block will contain 2
horizontal pixels and all combinations can be easily
precalculated. If available, vertical hardware scrolling
can assist in doubling the vertical resolution.
Additional effects can be achieved by introducing virtual
colours through dithering. This can also be precalculated
and especially computers with a low amount of colours can
benefit a lot from that, like the Commodore 64.
The ZX Spectrum, which was not really famous for
multi-coloured graphics, can use the character blocks directly
and do very quick and colourful chunky effects, even
though the resolution applied (32x24) is rather low.
And again, dithering expands the palette a lot so that
on the ZX Spectrum, you see stunningly beautiful plasma,
fire and shade-bob effects.
And even on the Atari ST, there were new school effects
without this way of doing c2p. A fine example is "Amok"
by the Confusions. It runs in midres and expands the
virtual palette by dithering. The biggest advantage of
this idea is that in midres, you only need to set 2
words (1 longword) to completely set 16 planar pixels
and given a virtual resolution of 160 pixels a line,
you only need to convert 4 chunky pixels for the 2
bitplane-words. This is fairly fast but limits the
graphics quite a lot. Basically, only greyscales look
good in this context.


Chapter Eight: Epilogue
-------------------------

Another boring tutorial, but i don't think the basics of
c2p have ever been described this completely before. C2p is a
set of routines that you stop thinking about once you have
understood the aspects of it, but for someone new to coding
new-school effects on the Atari ST, it's quite hard to get
into - at least i suffered from that.
But thanks to Ultra of Cream, Ray of .tSCc. and DefJam of
Checkpoint, even i got the hang of it and hopefully, so
can you.
To give credit to the people behind this c2p-scheme, even
though it's kind of difficult: It seems like the original
idea to apply movep is by Kalms of the Amiga Group The
Black Lotus, and it seems like Llama and Dynacore of
.tSCc. first used the basic idea to do chunky to planar
conversion on the Atari ST.


The Paranoid/Paradox