
S3 drive processing with jupyter workspace

Procedure for using the new S3 mount on a workspace.

When creating a new workspace or restarting an existing one, a menu showing the contents of the S3 drive is displayed.

All catalog data is stored in the catalog-data folder, which contains two sub-folders:

  • Campaign_data, with data organised by campaign (biosar, afrisar, tropisar, user data, etc.)

  • a sub-folder with esacci data

The S3 drive also contains the user-data folder, where user data are organised by theme (biopal, SARSIM, Brix2, etc.)

These data are hosted on the platform and mounted as read-only:
because the S3 drive is a Linux mount, it can be browsed with standard Linux commands, and the data can be consulted in place or copied locally.
For this purpose, a dedicated notebook with the main commands (taking an S3-drive path as input) has been created, available at /demo-scripts/mmap-S3/script-s3_stream_direct.ipynb.

Open the notebook, enter the path of the data of interest as input, then run the commands you need.

The different commands that can be used are the following:

  • !ls -sh <s3-drive path>/*.tiff

(list the data and view their sizes)

  • !time gdalinfo <s3-drive path>/*.tiff

(get the information associated with the selected data)

  • !mkdir <folder name>

(create a new folder into which to copy the data you want to retrieve)

  • !time cp <s3-drive path>/*.tiff <folder name>/.

(copy data into the folder created previously)

  • data-processing commands

(to display the data; this replaces the maap-s3 download script)
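Put together in a notebook cell, the listing and copy steps above look like the sketch below. Since the real mount is only available on the platform, a temporary directory stands in for the S3 drive here, and the file and folder names are hypothetical:

```shell
# Temporary directory standing in for the read-only S3 mount (hypothetical)
S3_DIR=$(mktemp -d)
touch "$S3_DIR/scene.tiff"

# 1. List the data and view their sizes
ls -sh "$S3_DIR"/*.tiff

# 2. Create a local folder to receive a copy
WORK_DIR=$(mktemp -d)

# 3. Copy the data into the folder, timing the transfer
time cp "$S3_DIR"/scene.tiff "$WORK_DIR"/

# The copy is now available locally
ls "$WORK_DIR"
```

On the platform, the stand-in directory would be replaced by an actual S3-drive path, and gdalinfo could be run on the files in place.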

These commands make it possible to list the selected catalog or user data, consult their content, and either copy them locally or consult them directly over the internet.
Data access is now handled by the S3 mount: there is no need to run the "download" step of the maap-s3 notebook script, and any data can be displayed directly from the workspace.
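The difference with the old download workflow can be sketched as follows; again a temporary directory plays the role of the mount, and the file name and content are hypothetical:

```shell
# Stand-in for the read-only S3 mount (hypothetical path and content)
S3_DIR=$(mktemp -d)
printf 'header-bytes' > "$S3_DIR/scene.tiff"

# New workflow: read the file in place on the mount, with no download step.
# (Previously, the maap-s3 script had to download the file first.)
head -c 6 "$S3_DIR/scene.tiff"
```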

Use of the cache is limited to files smaller than 2.5 GB.
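A quick way to check whether a file falls under that limit before relying on cached reads is sketched below. The 2.5 GB threshold comes from the note above (written here as a decimal approximation, since the exact unit is not specified), and the temporary file stands in for a file on the mount:

```shell
# 2.5 GB cache threshold, decimal approximation (assumption: exact unit not stated)
LIMIT_BYTES=$((2500 * 1000 * 1000))

# Temporary file standing in for data on the S3 mount (hypothetical)
FILE=$(mktemp)

# wc -c gives the size in bytes; tr strips padding some systems add
SIZE=$(wc -c < "$FILE" | tr -d ' ')
if [ "$SIZE" -lt "$LIMIT_BYTES" ]; then
  echo "file fits under the cache limit"
else
  echo "file too large to be cached"
fi
```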
