[[_TOC_]]
# Before starting
Command-line applications from Volume Cartographer and registration-toolkit are shown being run by name in this document. If these applications have not been installed to your system path, you must modify the provided examples to point to the location of your compiled applications, often the `build/bin` folder for each project. For example, you might do the following to run `vc_render` from the `volume-cartographer` folder in your home directory:
```shell
~/volume-cartographer/build/bin/vc_render ...
```
You can temporarily add the `bin/` directory from `volume-cartographer` to your path using the following:
```shell
export PATH=$PATH:~/volume-cartographer/build/bin/
```
# Inventory
Files may be distributed to begin with. Check the following documents and locations to make an inventory of what work has been done on this object and where it is currently stored. If necessary, merge the work done into one central location. The preferred central location is `gemini1-2:/mnt/gemini1-4/seales_uksr/nxstorage/data/` in a directory structure such as (example) `Herculaneum_Scrolls/PHercParis2/Frag47/PHercParis2Fr47.volpkg`. However, `gemini` is only usable for storage and not active work. So start by consolidating things on `lcc:/pscratch/seales_uksr/` or your local machine. The former may be necessary due to disk space.
* [Moonshot Data Progress Tracking Sheet](https://docs.google.com/spreadsheets/d/16s8GkQ74w5fmp6d1MwYGtmcf26gk9PjrD_ldManLhKw/edit?usp=sharing) (old)
* [CT Data Manifest](https://luky-my.sharepoint.com/:x:/r/personal/cpa232_uky_edu/Documents/Projects/2022/20221223%20-%20CT%20Manifest/CT%20Data%20Manifest.xlsx?d=w63135ef37bd7489d8314b1cad1bc218c&csf=1&web=1&e=VSUpkm) (new)
* `gemini1-1:/mnt/gemini1-3/seales_uksr/`
* `gemini1-2:/mnt/gemini1-4/seales_uksr/`
* `lcc:/pscratch/seales_uksr/`
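If work needs to be consolidated onto LCC scratch as described above, a hypothetical rsync from gemini might look like the following (hostnames and paths are placeholders based on the example directory structure above; adjust them for your object):
```shell
rsync -avh --progress gemini1-2:/mnt/gemini1-4/seales_uksr/nxstorage/data/Herculaneum_Scrolls/PHercParis2/Frag47/PHercParis2Fr47.volpkg /pscratch/seales_uksr/
```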
## Optional: Extract/crop hdf files on LCC resources
For particularly large datasets (such as those split into slabs), the entire dataset may not fit on your desktop machine. In these cases it may be more efficient to crop the source files on the source server BEFORE transferring them to your desktop for tasks requiring a graphical interface/user intervention. This allows slabs to be processed in parallel up until volume packaging and greatly reduces the size of the initial data transfer.
### One-time setup
For convenience, a singularity container and skeleton slurm script have been placed in the DRI Datasets drive directory under Resources. These will allow for easy use of the relevant scripts from volume cartographer on the LCC servers, and should be copied to your scratch space before you begin:
```shell
rclone copy dri-datasets-remote:/Resources/ $SCRATCH/data_processing_temp_space/ -v
```
The slurm script included should be lightly edited to be specific to the user. In particular the email field should be changed to the relevant address.
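The exact contents depend on the skeleton script provided, but the lines to update will look something like the following SLURM directives (the address shown is a placeholder):
```shell
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_address@uky.edu
```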
### File transfer
You will need to move the source hdf slab(s) to your scratch space on the LCC. This is because we will be running crop and packaging on one of the LCC worker nodes, for which the `/gemini1-3/` location is not mounted, so the files need to be moved somewhere the job nodes can find them:
On the LCC data transfer node:
```shell
rclone copy /path/to/large/dataset $SCRATCH/data_processing_temp_space/ -v --include "*.hdf"
```
This will copy the hdf5 source files to your scratch space, but not the previously extracted .tif files. To view those .tif files and determine the appropriate crop dimensions, you will still need to transfer them to your work machine and view them in ImageJ/Photoshop.
On your work machine:
```shell
rclone copy dtn-remote:/path/to/large/dataset ./data_processing_temp_space/ -v --include "*.tif"
```
### Running extract/crop on LCC
Now use `sbatch` and the previously copied slurm script to run extract/crop on the LCC system. The parameters passed to this script are passed along to the `hdf5_to_tif.py` script included with volume cartographer, so they should be treated in the same way:
```shell
sbatch run_hdf_to_tif.sh --input-file ./data_processing_temp_space/slab_file.hdf
```
This should run fairly quickly and multiple slabs can be processed at once. Now you can transfer these cropped slices to your workstation with rclone and proceed to packaging as normal. Don't forget to include with your volume package the details of the crop performed, as specified below.
## Transfer data
Get the original volume or slices onto the machine you are using to process the dataset. This depends on context, but typically we use scp, rclone, etc.
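For example, two hypothetical transfers (hostnames, remote names, and paths are placeholders):
```shell
# Pull slices from an LCC data transfer node with scp
scp -r lcc-dtn:/pscratch/seales_uksr/PHerc2Fr143/volumes/54kV/ ./PHerc2Fr143/volumes/

# Or pull them from a configured rclone remote
rclone copy dri-datasets-remote:/PHerc2Fr143/volumes/54kV/ ./PHerc2Fr143/volumes/54kV/ -v --progress
```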
## Extract slices (HDF5 files only)
**Coming soon**
## Crop slices (optional)
Many of our datasets are too large to be processed efficiently in their native format. Cropping is the preferred method for reducing size as it maintains the spatial resolution of the scan. Scan through the slices to determine a good bounding box for the object in the scan. Test your crop using the `convert` utility provided by ImageMagick. The following command creates a 9060x1794 image starting at pixel (670,830) in the `full_slice_0000.tif` input image:
```shell
convert -crop 9060x1794+670+830 original/full_slice_0000.tif test.tif
```
Review the resulting test image to ensure that the crop looks correct. When you are satisfied with your crop settings, you can mass crop your images using the `mogrify` command, or with `convert` and GNU `parallel`. **Warning:** By default, `mogrify` will modify images in-place. Be sure you have properly specified the `-path` argument to avoid data loss:
```shell
mkdir -p cropped/
find original/ -type f -name 'full_slice_*.tif' | parallel --bar convert -crop '9060x1794+670+830' {} cropped/{/}
```
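Alternatively, if you prefer `mogrify` as mentioned above, a sketch of the same crop; the explicit `-path` writes results into `cropped/` instead of overwriting the originals:
```shell
mkdir -p cropped/
mogrify -path cropped/ -crop '9060x1794+670+830' original/full_slice_*.tif
```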
If you crop a dataset, please save the crop information into a text file and store it alongside the volume in the Volume Package. The following example matches the above crop command:
```
CROP
Output: cropped/
Crop: 9060x1794+670+830
```
## Resize volume (optional)
**Coming soon**
## Create .volpkg
If your object does not yet have a Volume Package, use `vc_packager` to create one. New packages require the following flags:
* `--volpkg (-v)`: The name of the Volume Package
* `--material-thickness (-m)`: The estimated thickness of a layer (i.e. page) in microns
* `--slices (-s)`: Path to an importable volume.
If you are adding a volume to an existing Volume Package, only the path to the Volume Package and the path to the importable volume are required.
Example:
```shell
# Create a new Volume Package
vc_packager -v PHerc2Fr143.volpkg -m 200 -s PHerc2Fr143/volumes/54kV_cropped/

# Add a volume to an existing Volume Package
vc_packager -v PHerc2Fr143.volpkg -s PHerc2Fr143/volumes/88kV_cropped/
```
`vc_packager` supports importing Grayscale or RGB images in the TIFF, JPG, PNG, or BMP formats with 8 or 16 bits-per-channel. The `--slices` option accepts a number of different importable volume types:
* **Path to a Skyscan reconstruction log file**: If the dataset is a reconstruction from a Skyscan scanner, provide the path to the Skyscan reconstruction log file (`.log`). In addition to adding the slices to the Volume Package, `vc_packager` will also automatically import associated scan metadata, such as the voxel size.
* **printf-style file pattern for slice images**: If a directory of slice images also contains images that are not slices, specify the file pattern for the slice images using a printf-style pattern. Currently, only `%d` and zero-padded variants (e.g. `%04d`) are supported.
* **Path to a directory of slices**: If a directory contains only slice images, simply specify the path to the directory. Image detection is greedy, and any images in the directory that are not slice images will cause errors. Note that lexicographical sort is used to arrange the images, therefore it is a good idea to zero-pad any numerical orderings in the image filenames.
```shell
# printf-style file pattern for slice images
vc_packager -v ObjectName.volpkg -s MixedData/slices_%04d.tif

# Directory containing only slice images
vc_packager -v ObjectName.volpkg -s OnlySlices/
```
## Segmentation
### Hidden layers
Now two options are available for segmenting hidden layers:
1: For hidden layers or wrapped objects, you can use the VC GUI app to perform segmentation. This will create a new Segmentation directory in the `paths` subdirectory of the Volume Package. Make sure to add the Segmentation ID to the progress tracker spreadsheet.
2: A new segmentation approach exists, currently referred to as "Thinned Flood Fill Segmentation". This algorithm is not incorporated into the VC GUI app at this time. However, this algorithm is very useful for quickly segmenting layers of manuscripts where separation between layers is obvious in the tomographic data. The instructions for using Semi-Automated Segmentation in its current form are outlined below:
* First, create a directory to hold your segmentations. All volume packages should have a working directory that would be a good place to create a subdirectory to hold your segmentations.
* Open the VC GUI app and load the volume package you want to segment from. Start a new segmentation and place the points along the layer you want to segment. Make sure to disable the Pen Tool when you are finished so the points are saved.
* Next, open a terminal window, navigate to the directory you created, and execute vc_segment. A template for this is provided below. The segmentation files will be saved in the current working directory. Be careful not to overwrite the data you created in previous executions.
* `--dump-vis`, `--save-mask`, and `--save-interval` (set to 1) are optional but strongly recommended parameters.
* `-l` is the low threshold parameter. It takes a 16-bit grayscale value (0-65535). For M.910, a value near 13000-14000 is typically a good choice.
* `--tff-dt-thresh` (a float value ranging from 0.0 to 1.0) determines how much of the mask will be pruned away before the thinning algorithm begins. Setting this too high will disrupt the continuity of the skeleton; setting it so that continuity is not disrupted is important. Leaving this parameter out will result in none of the mask being pruned away, so this is a good option if the layer is extremely thin. This parameter can be useful if set correctly, as some segmentation errors will be eroded away.
```
path/to/volume-cartographer/build/bin/vc_segment -m TFF -s VC_SEGMENTATION_ID -v /path/to/MS910.volpkg/ --start-index START_SLICE --end-index END_SLICE -l LOW_GRAY_THRESHOLD_16_BIT_GREY --tff-dt-thresh THRESHOLD --dump-vis --save-mask --save-interval 1
```
(Fall 2020) M.910 Common Settings:
```
path/to/volume-cartographer/build/bin/vc_segment -m TFF -s VC_SEGMENTATION_ID -v /path/to/MS910.volpkg/ --start-index START_SLICE --end-index 5480 -l 13500 --tff-dt-thresh .2 --dump-vis --save-mask --save-interval 1
```
* If you enabled dump-vis, a new directory called 'debugvis' will appear inside the working directory. Two directories inside, called 'mask' and 'skeleton', contain images that show what is being segmented. You can use these images as a reference to help you determine when to stop the segmentation.
To obtain a good-quality segmentation, the mask must cover the majority of the layer of interest, but it is fine if some small parts aren't covered or parts of neighboring pages get segmented too.
This is an example of a good-quality segmentation: https://drive.google.com/file/d/1_qzL2L2gZpYHYUJznCZENbsW2ueUj8__/view?usp=sharing
Check the segmentation occasionally. If a lot of neighboring pages are getting segmented or if the segmentation loses the layer you are segmenting, use Ctrl-C to kill vc_segment **provided you ran vc_segment with `--save-interval 1`**.
* Save all of the files outputted by vc_segment into a subdirectory. (For example, if you are segmenting page 20, you could name the directory 20_1 if it was your first segmentation of that page. Just be consistent.)
* Go back to the VC GUI app and create another *new* segmentation (do not try to re-use the old one; VC will crash) at the first slice that the previous segmentation did badly on. Run vc_segment with the new segmentation, starting at the new segmentation's slice number. Repeat this process of running, stopping, and resetting until the page is segmented as much as possible. Near the end of the volume it may become very difficult to keep going (for M.910, this can happen when you are in/near the 4000s), so just stop if you are near the end of the volume and it is too difficult.
* Merge the vcps files to make a single large file. Do this by using vc_merge_pointsets. Usage:
```
path/to/volume-cartographer/build/bin/vc_merge_pointsets -i path/to/dir_containing_all_pointsets -o path/to/output_dir
```
Note: vc_merge_pointsets takes a directory that contains all vcps files you wish to merge as input. Put all vcps files in there. You will need to rename them, but make sure to keep the .vcps extension.
Note: Use the --prune flag to prune the pointsets during the merge. **Important: every vcps file must be named the last slice number of that segmentation that you wish to keep (example: a segmentation that goes to 1200 slices but is only valid until 1150 should be named 1150.vcps)** (Coming soon)
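For example, assuming a segmentation stored under `20_1/` that is only valid through slice 1150 (the directory and file names here are hypothetical):
```shell
mkdir -p pointsets_to_merge/
cp 20_1/pointset.vcps pointsets_to_merge/1150.vcps
```
The `pointsets_to_merge/` directory is then passed to `vc_merge_pointsets` with `-i`.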
* Convert the vcps file into a point cloud. Do this by using vc_convert_pointset. Usage:
```
path/to/volume-cartographer/build/bin/vc_convert_pointset -i pointset_name.vcps -o output_mesh_name.obj
```
- (Fall 2020) Upload the merged point cloud and all pointsets (mask_pointset.vcps and pointset.vcps for each segmentation) to the DRI Experiments Shared Drive inside the folder linked here: https://drive.google.com/drive/folders/1U7wg1mGDlg6wLsRx_EtCIEMsNmh8yxJj?usp=sharing
Create a new folder named after the page number, and put the .vcps files, point clouds, and meshes inside that folder.
- **(Fall 2020: ignore this step)** These point clouds need further cleaning before they can be turned into a mesh. This process is done in Meshlab. Refer to the processing instructions for canny segmentation below (see the "Exposed layers" section.) The same steps will be necessary for these point clouds.
### Exposed layers
For flat, exposed layers, manually segment the layer using the canny edge detection segmentation utility. To better keep track of these manually defined segmentations, first make a working directory for your segmentation inside the Volume Package and then run `vc_canny_segment`:
```shell
# Make a working directory inside the volpkg
mkdir -p working/54kv_surface_layer/
cd working/54kv_surface_layer/

# Run the canny segmentation utility
vc_canny_segment -v ../../ --volume 20200125113143 --visualize -o canny_raw.ply
```
For each slice in the specified volume, `vc_canny_segment` runs a canny edge detector to isolate surfaces. It then marches along each row or column of the image and projects a line from an edge of the image onto the canny-detected edges. By default, the first edge detected becomes a 3D point in the output point set. This utility has a number of useful options for controlling the segmentation:
* `--projection-edge (-e)`: The edge to project from when detecting surface points. Accepts: `L, R, T, B` for Left, Right, Top, and Bottom respectively.
* `--visualize`: Opens a window that lets you preview the canny edge detection options. Adjust the parameters until the edge of the fragment you are trying to segment is highlighted in white, but any background noise is not highlighted. Make sure the min threshold is less than the max threshold. If they are flipped, the finished segmentation may not look as expected. Once done, hit any key to run the segmentation.
* `--mask`: Provide a B&W image where white indicates the region of the volume to consider for edge detection. Any canny edges in the slices that overlap with the black portion of the mask will be ignored.
The output of this process, `canny_raw.ply`, is a dense point set and requires further processing in Meshlab:
0. Select `View -> Toggle Orthographic Camera` in Meshlab to use the orthographic camera, which removes perspective distortion when viewing the point cloud and makes it easier to select regions manually.
1. Run `Filters/Point Set/Point Cloud Simplification` to reduce the point set to a reasonable size. If the surface is very smooth, use fewer points. Usually, within the order of 10k to 100k points typically retains enough detail while significantly speeding up later steps. Save this point set with the name: `01_simplified.ply`.
2. Manually select and delete points that are not on the desired surface.
3. Run `Filters/Selection/Select Outliers` and then delete the selected vertices. This cleans up groups of points that are not on the surface. It is recommended to enable the Preview option while tuning the selection options.
4. Run `Filters/Point Set/Compute normals for point sets` to estimate a surface normal for each point in the cloud. These normals are used by the Poisson reconstruction in the next step.
5. Run `Filters/Remeshing, Simplification and Reconstruction/Surface Reconstruction: Screened Poisson` to triangulate the surface. This filter uses the surface normals generated in the previous step to fit a continuous surface to the point set. Increase the Reconstruction depth to make the surface fit more closely to the original point set at the expense of more faces and a rougher surface. Typically, use a reconstruction depth in the range of 8-10. Save this mesh to your working directory with the name: `canny_poisson.ply`
6. Poisson will create faces which extend beyond the original point set. Run `Filters/Sampling/Hausdorff Distance` to add an attribute to each vertex of the new surface that is that vertex's distance to the nearest point in the original point set.
7. Run `Filters/Selection/Select by Vertex Quality` to select those vertices in the Poisson surface which have large distances from the original point set. Use the Preview option to tune the selection. Delete the selected vertices and faces.
8. Run `Filters/Cleaning and Repairing/Remove T-Vertices by Edge Flip`. The number of removed t-vertices will be reported in the log panel.
9. Run `Filters/Quality Measure and Computations/Compute Topological Measure`. A topology report will be printed in the log panel in the bottom-right of Meshlab. Based on what is reported, perform the following steps. Repeat this step as needed.
* `Unreferenced Vertices N > 0`: Run `Filters/Cleaning and Repairing/Remove Unreferenced Vertices`. The number of removed vertices will be reported in the log panel.
* `Mesh is composed by N > 1 connected component(s)`: Run `Filters/Cleaning and Repairing/Remove Isolated pieces (wrt Face Num.)` to remove small, disconnected connected components. Use a component size of 1000 or larger to ensure that all surfaces that you will remove are not connected to your segmented surface. The number of removed connected components will be reported in the log panel.
* `Mesh has N > 0 non two manifold edges...`: Run `Filters/Cleaning and Repairing/Repair non Manifold Edges by removing faces`.
* `Mesh has N > 0 holes`: Run `Filters/Remeshing, Simplification and Reconstruction/Close Holes`. Adjust `Max size to be closed` to large values until all holes are closed.
10. Save your final mesh as a new file with a name which matches your working directory (e.g. `54kv_surface_layer.ply`). After selecting the output file location, a window with saving options will open. Click the box to uncheck `Binary encoding` to save the file in an ASCII format. **This is required for using this mesh with vc_render.**
## Texturing
All texturing should be performed with the `vc_render` command-line application. Do not use VC Texture.app.
### Segmentations from VC.app
Make a new working directory for your segmentation inside the Volume Package and provide `vc_render` with the volume package and segmentation ID of your segmentation:
```shell
# Make a working directory inside the volpkg
mkdir -p working/54kv_internal_layer/
cd working/54kv_internal_layer/

vc_render -v ../../ -s 20200125113143 --output-ppm 54kv_internal_layer.ppm --uv-plot 54kv_internal_layer_uvs.png --method 1 -o 54kv_internal_layer.obj
```
### Segmentations from canny segmentation
Provide `vc_render` with the volume package, the final mesh produced by Meshlab, and the ID of the segmented volume:
```shell
vc_render -v ../../ --input-mesh 54kv_surface_layer.ply --volume 20200125113143 --output-ppm 54kv_surface_layer.ppm --uv-algorithm 2 --uv-plot 54kv_surface_layer_uvs.png --method 1 -o 54kv_surface_layer.obj
```
### Retexturing the segmentation
The above commands generate a texture image using the Intersection texture method (`--method 1`). This is the fastest texturing method and will help you more quickly verify that your flattened surface is correctly oriented and contains no significant flattening errors. However, this image is not always useful for aligning the reference image. If you have difficulty finding point correspondences in the [registration step](#align-the-reference-image), use the `vc_render_from_ppm` utility to generate new texture images using alternative parameters:
```shell
vc_render_from_ppm -v ../../ -p 54kv_surface_layer.ppm --volume 20200125113143 -o 54kv_surface_layer_max.png
```
There are many texturing parameters available in both `vc_render` and `vc_render_from_ppm`. We have found that the following alternatives are consistently useful for generating new textures:
* Composite method, Max filter: The default texturing method if no options are passed. Returns the brightest intensity value in the neighborhood. Enable with these options: `--method 0 --filter 1`
* Composite method, Mean filter: Return the average of the neighborhood's intensity values. Useful if the dataset is noisy. Enable with these options: `--method 0 --filter 3`
* Integral method: Return the sum of the neighborhood's intensity values. Sometimes shows subtle details that are missed by the Composite method. Enable with these options: `--method 2`.
* Adjust the texturing radius: The size of the texturing neighborhood is automatically determined by the Volume Package's material thickness metadata field. Because this value is an estimate of a layer's thickness, it is sometimes too small or too large. To manually set the search radius, pass a real value in voxel units to the `--radius` option.
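For example, the options above can be combined in a single `vc_render_from_ppm` call; here the Composite method and Max filter are used with a hand-picked radius (the radius value and output filename are illustrative only):
```shell
vc_render_from_ppm -v ../../ -p 54kv_surface_layer.ppm --volume 20200125113143 --method 0 --filter 1 --radius 15 -o 54kv_surface_layer_max_r15.png
```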
### Speeding up flattening and PPM generation
The processing times for flattening and PPM generation are sensitive to the number of faces in the segmentation mesh. In particular, meshes generated from the `vc_canny_segment` process are often densely sampled, thus leading to long processing times. For these meshes, use the mesh resampling options in `vc_render`:
```shell
vc_render -v ../../ --input-mesh 54kv_surface_layer.ply --volume 20200125113143 --enable-mesh-resampling
```
See `vc_render --help` for more options related to resampling. This flag is enabled by default for segmentation inputs passed with the `-s` option, but disabled for all inputs passed with `--input-mesh`. The number of vertices in the output mesh can be controlled with the `--mesh-resample-factor` option, which sets the approximate number of vertices per square millimeter in the resampled mesh. Newer versions of volume-cartographer (5abb42db and up) additionally have the `--mesh-resample-vcount` option, which exactly controls the number of vertices in the output mesh. Be careful not to set the vertex count too low, as this can modify your mesh such that it no longer intersects the object's surface.
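For example, a sketch combining resampling with the canny texturing command from above (the factor value is illustrative and should be tuned for your data):
```shell
vc_render -v ../../ --input-mesh 54kv_surface_layer.ply --volume 20200125113143 --enable-mesh-resampling --mesh-resample-factor 50 --output-ppm 54kv_surface_layer.ppm --uv-algorithm 2 --uv-plot 54kv_surface_layer_uvs.png --method 1 -o 54kv_surface_layer.obj
```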
### Fixing orientation errors
The various flattening (aka UV) algorithms available in volume-cartographer will produce flattened surfaces which are often flipped or rotated relative to what the observer would expect if they were to look at the surface in real life. The presence of these transformations may not become known until attempting to align the reference photograph to the generated texture image. **Textures, PPMs, and all subsequent steps should be updated to match the expected orientation when these problems are detected.**
The `vc_render` application provides the `--uv-rotate` and `--uv-flip` options to adjust for these transformations. The effect of these flags can be previewed without waiting to generate a full texture image by looking at the file specified by the `--uv-plot` flag:
```shell
# Rotate the flattened surface before texturing (the rotation value is an example)
vc_render -v ../../ -s 20200125113143 --output-ppm 54kv_internal_layer.ppm --uv-plot 54kv_internal_layer_uvs.png --uv-rotate 90 --method 1 -o 54kv_internal_layer.obj
```
Consult an expert or scholar to ensure the orientation at this stage is correct. To us CS folk, it can be easy to have text that looks correct but is actually mirrored, for example. This is the time to make sure it is oriented correctly!
## Align the reference image
### Using algorithmic registration
Using the [Landmark Picker GUI app](https://code.cs.uky.edu/seales-research/landmark-picker), generate a landmarks file which maps points in the reference photograph onto the same points in the texture image generated by the previous step.
- Load the texture image as the Fixed image and the reference photograph as the Moving image.
- Select 6+ point correspondences between these two images.
- Export the landmarks file to your working directory: `54kv_surface_layer_landmarks_ref2ppm.ldm`
- Use the registration-toolkit applications to compute a transform from the landmarks file and apply it to the reference photograph, e.g. `rt_apply_transform -f 54kv_surface_layer.png -m PHercParis2_Fr143r_RGB.jpg -t ...`. The transform can also be applied to a moving image other than the one used to pick the landmarks.
This can be useful if you wish to align an RGB photograph to the texture image, but surface details can only be seen in an alternative channel (i.e. infrared).
### Manual registration using Photoshop's Puppet Warp
- drag files into PS
- when done open up smart layer (.psb), turn on ink-label visibility, and save. (this controls the way in which the smart layer appears in the main .psd file)
- go back to the main (.psd) file and turn the visibility on for both "photo" and "texture" layers. (otherwise, as a result of puppet-warping, the "photo" layer may no longer be a perfect rectangle any more)
- still on the main (.psd) file, choose File->Save a Copy, choose PNG as the format and save.
- open the saved ink-label .png file in Photoshop, go to image->mode and change it to "grayscale" "8-bit".
- repeat the previous 4 steps with photo visibility turned on (and saved) in the smart layer (.psb)
- can re-enter puppet warp and make more changes if desired by double clicking "puppet warp" under smart filters in layers dialog
- might want to read up on puppet warp documentation
## Generate ink labels
Ink labels are black-and-white images which indicate those areas of the PPM which contain ink and those which do not. They are manually created in Photoshop using the following steps:
* Open the aligned texture image in Photoshop.
* Use the [Quick Selection Tool](https://helpx.adobe.com/photoshop/using/making-quick-selections.html#select_with_the_quick_selection_tool) to select all regions of visible ink in the reference photograph (**Note:** By holding the Alt/Option key, you can easily switch between adding to/removing from the current selection).
* Once you are satisfied with your selection, click the `Create a new layer` button at the bottom of the Layers panel.
* Select `File/Save As...` and save this image as a PNG to your working directory (e.g. `54kv_surface_layer_inklabels.png`). **Be careful not to overwrite the Photoshop file saved previously.**
* Close Photoshop but **do not** save the Photoshop file.
## Region set
For now, manually create a region set `.json` file defining training and prediction regions.
## Run ML
Based on inkid documentation and examples, found in the README here. The “SLURM Jobs” section points you to documentation for running jobs using SLURM and Singularity. A prebuilt container is available here so you shouldn’t have to go through the build process yourself.
## Uploading to Google Drive
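Uploads are typically done with rclone and a Google Drive remote. A minimal sketch, assuming the `dri-datasets-remote` remote used earlier in this document and a hypothetical destination path:
```shell
rclone copy PHerc2Fr143.volpkg dri-datasets-remote:/Herculaneum_Scrolls/PHercParis2/Frag143/PHerc2Fr143.volpkg -v --progress
```
See the Gotchas below for notes on transferring very large files to Drive.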
## Gotchas
- **Transfer:**
- Transferring >~1TB volumes can be surprisingly difficult. Google Drive offers us unlimited space but maximum 750GB/day upload/download. This only applies when beginning to transfer a new file, so it is still possible to deal with files that are >750GB, you just only get one per day. So, it is possible to create an archive file of a set of slices (usually compressing them takes more time than it is worth) and then just transfer the one file. For some reason rclone struggles with this, and has so far failed to work on files this large for me (but it will spend a day getting to 70% before failing and starting over). I have had more luck just downloading the 1TB file through a web browser and the Google Drive web UI. Of course that is also prone to paused or canceled downloads if you aren't careful.
- **Resize/crop slices:**
- Presently, only 16-bit TIF volumes are supported by Volume Cartographer. It is tempting when dealing with a particularly large volume to convert it to 8-bit to reduce the size by half, in an attempt to fit it into memory for running `inkid`. If you do this you'll be silently reminded later (by messed up texture images) that 8-bit volumes are not supported yet.
- ImageJ can load a set of slices for visualization/processing without holding them all in memory at once (if using "Virtual Stack"). This is quite nice. Unfortunately to do any heavy lifting (resize/interpolate an entire volume) it has to load the whole thing into memory, which is not possible with large enough volumes. It can also batch process the slices as individual images, so I have tried that for cropping the slices of a volume. However, it messed with the dynamic range of the slices in some silent and unknown way. The slices looked fine individually, but had different mean brightness from each other. This was only discovered way down the line when looking at a texture image, which was marred with odd stripes to the point of being useless.
- **Label (mask/image):**
- **Region set:**
- **Run ML:**
## Using Toggl for Time Tracking
- Log in to https://toggl.com/app/.
- At the bottom of the left sidebar, select the `DRI` workspace:
![1-workspace](uploads/5314eca2e73bdad95903924cdb8ec3c7/1-workspace.png)
- Select the Timer tool from the sidebar. Enter a description for the task you are working on. Use a description that includes the dataset and stage of processing:
![2-timer+desc](uploads/378b5f7de2bf50e437a5aeb831b4508b/2-timer+desc.png)
- Select the `Summer 2020 Data Processing` project from the Project dropdown menu:
![3-project](uploads/df22503bc2c55cbd4a667e4533a0ce3e/3-project.png)
- Select the appropriate tag(s) for your task in the Tag dropdown menu:
![4-tags](uploads/7e340687e3cbb3fccd0405b5abaeee95/4-tags.png)
- Click the green Play `▶` button to begin time tracking your task.
- When you are done working on your task, click the red Stop `■` button. Your progress will be added to the tracked tasks below the Timer tool.
- To resume working on a task, click the Play `▶` button for your task in the list of tracked tasks:
![5-resume-task](uploads/b638f7d316eaadc3f3a260c3b3924c37/5-resume-task.png)