How to Import and Export Data

Export

Users with administrator privileges can export the data from their profile page, which is reachable via the user name link in the header.

_images/profile.png

The Export All link starts a download of the data as JSON. The file contains one JSON object per line, because it is much easier to parse the data item by item than as a whole. Use your browser's save-as function to save a copy to disk.
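Because each line is a complete JSON object, a saved export can be read one item at a time. The following is a minimal sketch using only the Python standard library; the function name iter_export and the file name in the usage note are illustrative and not part of the application.

```python
import json


def iter_export(path):
    """Yield one decoded JSON object per non-empty line of an export file."""
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                # Tolerate blank lines, e.g. a trailing newline at the end.
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError as err:
                # Report the offending line so it can be found in the file.
                raise ValueError(f"line {line_number}: {err}") from err
```

For example, `sum(1 for _ in iter_export("export.json"))` counts the exported items without loading the whole file into memory.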

Import/Upload

There are two ways of importing data: the Bulk Import feature, or Import or Upload single Entry. Both are reachable via the Upload Data button on the profile page.

Import or Upload single Entry

On the profile page you will find the Upload Data button, which leads to an upload form. After uploading the file, select the Single Json option, which opens an editor form where you can edit the entry and commit it to the database.

Bulk Import

You can import small chunks of new data into the database by using Upload Data, but this makes it hard to track the changes. It is recommended to reset the database first and re-import a consistent data set.

You can upload Scidex SDF data or the internal JSON lines format from an Export All download. You have to choose which one it is before the next step, which is checking the data. This will take a few minutes. For the JSON lines format, the check verifies that the data conforms to the current schema; for Scidex SDF, it parses the SDF to extract the data.
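If you are unsure which of the two formats a file is in, a quick local look at its contents can help before choosing. The sketch below is a heuristic and not the check the application performs: it relies on SDF records ending with a line containing only "$$$$", and on every non-empty line of a JSON lines file being a JSON object. The function name guess_upload_format is hypothetical.

```python
import json


def guess_upload_format(text):
    """Return 'sdf', 'jsonl', or 'unknown' for the given file contents."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "unknown"
    # SDF files terminate every record with a '$$$$' delimiter line.
    if any(line.strip() == "$$$$" for line in lines):
        return "sdf"
    try:
        # JSON lines: each non-empty line must decode to a JSON object.
        if all(isinstance(json.loads(line), dict) for line in lines):
            return "jsonl"
    except json.JSONDecodeError:
        pass
    return "unknown"
```

This only guesses the container format. It does not validate the entries against the current schema, which is left to the checking step described above.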

During this step, you will see the number of items already processed and any warnings and error messages. A warning is issued if no InChI could be computed from the SDF, which is the case if the structure contains special atoms such as “*” or “R”. If no SMILES code could be computed, the structure in question is seriously non-conforming. Entries with no atoms and bonds in the structure part will also trigger this error. Do not close the window; wait until you are offered the option to commit the data.
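Records that are likely to raise these messages can be spotted locally before uploading. The sketch below is an illustration under stated assumptions, not the parser the application uses: it assumes V2000 molfile records, where the fourth line of a record is the counts line (atom count in columns 1 to 3, bond count in columns 4 to 6) and the atom symbol occupies columns 32 to 34 of each atom line. The names SPECIAL_SYMBOLS, split_sdf_records and check_record are hypothetical.

```python
# Symbols named in the text above; extend the set if your data uses others.
SPECIAL_SYMBOLS = {"*", "R"}


def split_sdf_records(text):
    """Split SDF text into records (lists of lines) at '$$$$' delimiters."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    # Keep a trailing record that lacks its closing delimiter.
    if any(line.strip() for line in current):
        records.append(current)
    return records


def check_record(lines):
    """Return a list of warning strings for one V2000 record."""
    if len(lines) < 4:
        return ["record too short to contain a counts line"]
    counts = lines[3]
    try:
        n_atoms = int(counts[0:3])
        n_bonds = int(counts[3:6])
    except ValueError:
        return ["unreadable counts line"]
    warnings = []
    if n_atoms == 0 and n_bonds == 0:
        warnings.append("structure has no atoms and no bonds")
    # The atom block starts right after the counts line.
    for atom_line in lines[4:4 + n_atoms]:
        symbol = atom_line[31:34].strip()
        if symbol in SPECIAL_SYMBOLS:
            warnings.append(f"special atom {symbol!r}")
    return warnings
```

Calling check_record on each item returned by split_sdf_records gives a rough preview of which records to inspect; the messages shown during the actual check remain authoritative.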

Committing sends the data to the database, and the import is done. All that remains is to wait until the changes are processed and the new version number shows up in the header.

Reset and Import

Delete All Data resets the database. Before you do this, make sure you have exported and saved a backup of the current state.

After deleting the data, you will notice that you can still see and search the database, which is confusing. This is because what really happened is that the version number was reset to 0 and the CouchDB database was deleted.

This means that as soon as you import any new data, the version of the CouchDB database changes, which triggers a rebuild of the search tables. The import of the new data is complete when the version number in the header is the same as the CouchDB version. While the search tables are rebuilding, the version field shows both versions as searchtable_version/CouchDB_version.
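The completion rule can be written down as a small check on the text of the version field. This is a sketch only: it assumes the field shows either a single number once both versions agree, or two numbers separated by a slash during a rebuild, and that both are plain integers. The function names are hypothetical, and how the header text is obtained is left open.

```python
def parse_version_field(text):
    """Return (searchtable_version, couchdb_version) as integers."""
    parts = text.strip().split("/")
    if len(parts) == 1:
        # Assumption: a single number means both versions are identical.
        version = int(parts[0])
        return version, version
    if len(parts) == 2:
        return int(parts[0]), int(parts[1])
    raise ValueError(f"unexpected version field: {text!r}")


def import_complete(text):
    """True when the search tables have caught up with CouchDB."""
    searchtable_version, couchdb_version = parse_version_field(text)
    return searchtable_version == couchdb_version
```

With these assumptions, a field reading "12/57" means the rebuild is still running, while "57" or "57/57" means the import has been fully processed.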

Version Numbers

The version number comes from the CouchDB version system. As CouchDB is an append-only data store, any document that is added, removed, or changed increases the version by one. Version 0 must therefore be an empty database.