Catalogs

For complicated lens systems, it can be useful to load the properties of the lenses, of the sources, or of the images from an external catalog. This functionality is implemented in Gravity.jl with a special syntax in the YAML file.

Ungrouped catalogs

Simple, "flat" catalogs will specify a series of records that will be converted, one by one, into an array of YAML elements.

Specifically, the user can add a loop: subsection in any part of the file. The first line in this special subsection is interpreted as an iterator: catalog pair. The key iterator names a variable that holds each catalog record in turn; the catalog part can be a simple string, indicating a CSV file (see below).

All following lines are duplicated once for each record, with every expression that uses the iterator replaced by the current record's values. Below is a simple example of this technique:

lenses:
    - NIE:
        name: main
        z: 0.32
        x: (0.0, 0.0) ± 1.0
        σ: 800 .. 1500
        e: 0.3 .. 1.0
        θ: -π/2 .. π/2
    - loop:
        l: cluster-members.csv
        filter: l.id < 3
        SIE:
            name: lens$(l.id)
            z: 0.32
            x: (l.x1, l.x2)
            σ: l.σ ± 50

This configuration file will require a cluster-members.csv catalog such as the following:

id,x1,x2,σ
1,3.21,2.13,210
2,-5.98,4.27,180
3,-8.92,-4.78,190

corresponding to the table

id   x1      x2      σ
1    3.21    2.13    210
2    -5.98   4.27    180
3    -8.92   -4.78   190

Note in particular that

  • In this case, we wish to generate a list of lenses; therefore, the loop section is placed inside the lenses section and is preceded by a dash, as in the example.

  • The CSV catalog file can contain any number of columns in any order.

  • The values in the CSV catalog can be combined freely by the user, by using the iterator name (in this case, l) followed by a dot and the column name.

  • One can optionally select a subset of the rows by using, as the second line after loop, the filter keyword. This must be followed by an expression that evaluates to a boolean (in the example, only rows with id < 3 will be used).

  • It is possible to use the column values within a string: for this purpose, one needs to use the syntax $(iterator.column), reminiscent of Julia string interpolation. As in Julia, the parentheses can also contain expressions, as in $(iterator.id * 2 + 1).
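As an illustration, the loop in the example above should be roughly equivalent to writing the two SIE lenses by hand, as sketched below. This is only a sketch: it assumes that the expansion behaves like a plain substitution of the record values, and the internal representation used by Gravity.jl may differ. Only rows 1 and 2 of the catalog appear, because of the filter l.id < 3.

```yaml
# Hand-expanded equivalent of the loop (assumed, for illustration only)
lenses:
    - NIE:
        name: main
        z: 0.32
        x: (0.0, 0.0) ± 1.0
        σ: 800 .. 1500
        e: 0.3 .. 1.0
        θ: -π/2 .. π/2
    # record with id = 1
    - SIE:
        name: lens1
        z: 0.32
        x: (3.21, 2.13)
        σ: 210 ± 50
    # record with id = 2
    - SIE:
        name: lens2
        z: 0.32
        x: (-5.98, 4.27)
        σ: 180 ± 50
    # the record with id = 3 is dropped by the filter
```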

In some cases, one might want to use different lens types for the different cluster members. This is also possible, as shown in the following example:

lenses:
    - NIE:
        name: main
        z: 0.32
        x: (0.0, 0.0) ± 1.0
        σ: 800 .. 1500
        e: 0.3 .. 1.0
        θ: -π/2 .. π/2
    - loop:
        l: cluster-members.csv
        l.type:
            z: 0.32
            x: (l.x1, l.x2)
            σ: l.σ ± 50
            s: l.s
            e: l.e
            θ: l.θ

with the catalog

id   type   x1      x2      σ     s     e     θ
1    SIS    3.21    2.13    210   NA    NA    NA
2    NIE    -5.98   4.27    180   0.3   0.8   0.57
3    NIS    -8.92   -4.78   190   0.2   NA    NA

Note that Gravity.jl will use the proper lens type, depending on l.type, and will also remove from the generated configuration the parameters associated with missing values (marked with NA in the catalog).
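To make the effect of the missing values concrete, the loop over the catalog above should be roughly equivalent to the hand-written list below. This is a sketch under the assumption that each NA simply drops the corresponding line; the exact form generated internally by Gravity.jl is not shown in this document.

```yaml
# Hand-expanded equivalent of the typed loop (assumed, for illustration only)
    # id = 1: s, e, and θ are NA, so they are omitted
    - SIS:
        z: 0.32
        x: (3.21, 2.13)
        σ: 210 ± 50
    # id = 2: all columns are present
    - NIE:
        z: 0.32
        x: (-5.98, 4.27)
        σ: 180 ± 50
        s: 0.3
        e: 0.8
        θ: 0.57
    # id = 3: e and θ are NA, so only s is kept
    - NIS:
        z: 0.32
        x: (-8.92, -4.78)
        σ: 190 ± 50
        s: 0.2
```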

The removal of missing parameters propagates up the hierarchy of the configuration file, so that no empty sections are introduced in the configuration. In other words, in the CSV file above, a line like

id   type   x1   x2   σ    s    e    θ
4    SIS    NA   NA   NA   NA   NA   NA

would produce no output.

This behavior can be convenient when generating the sources section of the configuration file. In particular, one could list all images associated with a given source in a single record of a CSV catalog, using NA entries for non-existent images. This allows one to use a different number of images for different sources.

Grouped catalogs

In some cases, it can be necessary to group a single catalog with respect to one or more keywords. A typical case is a single catalog that reports a different image on each line, with the images belonging to the same family identified through one or more columns. For example, consider the following table

id   z      src_id   x1      x2
1    0.67   1        6.82    -7.12
2    0.67   1        -3.20   -4.26
3    1.03   2        12.32   11.89
4    1.03   2        9.30    7.93
5    1.03   2        4.89    8.18

To use such an image catalog, we need to group the records by src_id (we could also group with respect to z, but we would run into trouble if two distinct sources had the same redshift).

The proper way of dealing with this catalog is the following:

sources:
    - loop:
        img: image-catalog.csv
        group-by: src_id
        Point:
            z: img.z
            x: (0.0, 0.0) ± 10.0
            images's:
                Point:
                    x: (img.x1, img.x2) ± 0.05

Note a few key points:

  • The loop: section contains, in the second line, a group-by: specification. As a result, the following section (the first Point) will be repeated only once for each distinct value of src_id.

  • Within this part, the code will select, for each individual src_id, the first record as representative. Therefore, in principle we could omit the redshift specification for images 2, 4, and 5, since records 1 and 3 are the first ones of their respective groups.

  • As soon as Gravity.jl finds a section ending with the plural mark 's, the corresponding section will be repeated for each individual record with the same src_id. This way, we will be generating two different images for the first source, and three for the second one.
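For the image catalog above, the grouped loop should be roughly equivalent to the hand-written configuration sketched below. Several details here are assumptions made only for illustration: that the 's mark is dropped from the generated key, and that the repeated sections become entries of a list. The form actually produced by Gravity.jl may differ.

```yaml
# Hand-expanded equivalent of the grouped loop (assumed, for illustration only)
sources:
    # src_id = 1: z is taken from the first record of the group (id = 1)
    - Point:
        z: 0.67
        x: (0.0, 0.0) ± 10.0
        images:             # assumed key after removing the plural mark
            - Point:
                x: (6.82, -7.12) ± 0.05
            - Point:
                x: (-3.20, -4.26) ± 0.05
    # src_id = 2: z is taken from the first record of the group (id = 3)
    - Point:
        z: 1.03
        x: (0.0, 0.0) ± 10.0
        images:
            - Point:
                x: (12.32, 11.89) ± 0.05
            - Point:
                x: (9.30, 7.93) ± 0.05
            - Point:
                x: (4.89, 8.18) ± 0.05
```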

Catalogs in the parameters section

As discussed above, catalogs are often used for lenses and sources, and therefore the loop statement frequently appears in the lenses or sources sections of the configuration file.

However, in some cases it might be necessary to define a list of constant or variable parameters using a table. This need often arises when modeling lenses or sources that require explicit parameter definitions: for example, extended sources with Sersic profiles will typically require an external definition for the Sersic index n.

In these cases, one can use the loop statement in the parameters section:

parameters:
    loop:
        lens: cluster-members.csv
        n_$(lens.id): 1 .. 5

The lines above will define one free parameter for each record of the catalog: with the three-row cluster-members.csv used earlier, these are n_1, n_2, and n_3, each with a uniform prior in the range 1 .. 5. These parameters can then be used in a loop statement for lenses or sources with the same n_$(lens.id) notation.
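Assuming the three-row cluster-members.csv catalog shown earlier (ids 1, 2, and 3), the loop in the parameters section should be roughly equivalent to listing the parameters explicitly, as in this sketch:

```yaml
# Hand-expanded equivalent of the parameters loop (assumed, for illustration only)
parameters:
    n_1: 1 .. 5
    n_2: 1 .. 5
    n_3: 1 .. 5
```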

Advanced catalog options

Catalog formats

So far, we have assumed that the input CSV files are properly formatted. The code reads them with the CSV.jl package, which accepts a number of keyword arguments. To take advantage of this flexibility, it is possible to specify the exact keywords to be passed to CSV.read. The syntax to use in the YAML configuration file is just

- loop:
    img: CSV("filename.csv"; header=3, comment="#", missingstring="NA"...)

So, essentially, one can just list the required format keywords as in the example above.

In-memory catalogs

Alternatively, it is possible to use data in memory. In this case, one needs to load or build an external catalog and store it as a DataFrame. Assuming the DataFrame is stored in a variable called catalog, one can just use within the configuration file the syntax

- loop:
    img: var"catalog"