Save 16 bytes per sky object. (76a92d4c) · Commits · Education / KStars

Commit 76a92d4c authored Dec 21, 2013 by Henry de Valence
Save 16 bytes per sky object.

In practice (on x86-64), the `long double` type has a size and
alignment of 16 bytes. We can inspect the memory layout of a class
inheriting from SkyPoint using clang [1]:

    *** Dumping AST Record Layout
       0 | class StarObject
       0 |   class SkyObject (primary base)
       0 |     class SkyPoint (primary base)
       0 |       (SkyPoint vtable pointer)
       0 |       (SkyPoint vftable pointer)
      16 |       long double lastPrecessJD
      32 |       class dms RA0
      32 |         double D
         |       [sizeof=8, dsize=8, align=8
         |        nvsize=8, nvalign=8]
     ...(snipped)...
     184 |   float B
     188 |   float V
         | [sizeof=192, dsize=192, align=16
         |  nvsize=192, nvalign=16]

The vtable pointer takes up only 8 bytes (on 64-bit), but we waste 8
bytes on padding so that lastPrecessJD starts on a 16-byte boundary.
Moreover, we then take up 16 bytes to store lastPrecessJD.
Using a program like the following:

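    #include <stdio.h>
    #include <math.h>

    int main()
    {
        /* Julian Date of the J2000 epoch, in days. */
        double jd2000 = 2451545.0;
        /* Gap to the next representable double above jd2000, in days. */
        double delta = nextafter(jd2000,jd2000+1) - jd2000;
        printf("delta: %.30f\n", delta);
        return 0;
    }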

we can compute that at J2000, the minimum time step at double precision
is about 4.66e-10 days, or approximately 40 microseconds, so it's not
clear that we gain anything by using 80-bit long doubles instead of
64-bit doubles.
Changing the `long double` to `double` (and placing it last) results in
a memory layout like so:

    *** Dumping AST Record Layout
       0 | class SkyPoint
       0 |   (SkyPoint vtable pointer)
       0 |   (SkyPoint vftable pointer)
       8 |   class dms RA0
       8 |     double D
         |   [sizeof=8, dsize=8, align=8
         |    nvsize=8, nvalign=8]

      16 |   class dms Dec0
      16 |     double D
         |   [sizeof=8, dsize=8, align=8
         |    nvsize=8, nvalign=8]

      24 |   class dms RA
      24 |     double D
         |   [sizeof=8, dsize=8, align=8
         |    nvsize=8, nvalign=8]

      32 |   class dms Dec
      32 |     double D
         |   [sizeof=8, dsize=8, align=8
         |    nvsize=8, nvalign=8]

      40 |   class dms Alt
      40 |     double D
         |   [sizeof=8, dsize=8, align=8
         |    nvsize=8, nvalign=8]

      48 |   class dms Az
      48 |     double D
         |   [sizeof=8, dsize=8, align=8
         |    nvsize=8, nvalign=8]

      56 |   double lastPrecessJD
         | [sizeof=64, dsize=64, align=8
         |  nvsize=64, nvalign=8]

This also has the benefit that the SkyPoint data fits in a single cache
line, though I don't think this really makes a difference given the
inefficiencies in the rest of the code. A before/after test showed a
drop in memory usage of about 6%.

[1]: http://eli.thegreenplace.net/2012/12/17/dumping-a-c-objects-memory-layout-with-clang/

CCMAIL: kstars-devel@kde.org
parent a49dec48