CK: Collective Knowledge

This blog post gives a brief introduction to CK and its basic concepts. There is a ton of documentation in the CK wiki on GitHub, and all of it can easily feel overwhelming. This is why I wrote this deliberately short and lightweight introduction to the fundamental concepts of CK, which helped me a lot in understanding it.

I assume that you have the CK tool installed on your machine, which you can easily check by running ck version. If this returns an error, install CK by running pip install ck (see footnote 1).

So what is CK?

Put generically, CK is a tool which helps you organise and work with stuff you care about. Stuff can be a lot of different things: research data, programs or scripts analysing this data, and the results obtained by the analysis, to give a typical research workflow as an example.

CK helps you to organise this stuff by assigning unique identifiers (so-called ‘UIDs’) to every entry registered with CK. Entries are stored in repositories, which facilitate sharing. Modules are a special type of entry which implement the functionality of CK. CK comes with a set of built-in modules, but you can also write custom modules yourself.

Entries, repositories, and modules are the basic vocabulary of CK. Let’s start talking more about them.

CK Entries

CK tracks entries by assigning them unique identifiers. Each entry is stored in a separate directory, and CK also stores additional metadata in the form of a couple of JSON files for each entry. These files are stored in the .cm subdirectory of the entry. There are three metadata files:

CK Repositories

In CK, a repository is a collection of entries which are meant to be shared with other people. CK uses git, which makes it incredibly easy to share repositories among team members or make them publicly available. Websites such as GitHub or Bitbucket can be used to host CK repositories online.

CK stores all of the repositories in one central folder. On Linux and macOS this is by default: $HOME/CK_REPOS.

CK Modules

Modules in CK group entries as well as actions to operate on these entries. CK entries which are operated on by a particular module are put in a directory which has the same name as the module. For example:

This leads to a familiar directory structure where the top-level directories are named after CK modules, e.g., program, dataset, and experiment. The second-level directories store the actual programs, datasets, and experiments you care about, e.g., program/my-awesome-program, dataset/my-awesome-dataset, and experiment/my-awesome-experiment. These are themselves CK entries with their own metadata and UIDs.
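The layout described above can be sketched in a few lines of Python. This only illustrates the directory convention and is not CK's own code: the helper name entry_path is hypothetical, the repository name my-repo is made up, and the default root simply reuses the $HOME/CK_REPOS location mentioned earlier.

```python
from pathlib import Path

# Hypothetical helper illustrating the repository/module/entry layout.
# The default root reuses the $HOME/CK_REPOS location mentioned in the text.
DEFAULT_ROOT = Path.home() / "CK_REPOS"


def entry_path(repo, module, entry, root=DEFAULT_ROOT):
    """Return the directory where an entry of the given module would live."""
    return Path(root) / repo / module / entry


if __name__ == "__main__":
    # "my-repo" is a made-up repository name used only for this example.
    print(entry_path("my-repo", "program", "my-awesome-program"))
```

The module name always sits between the repository and the entry, which is why an entry can be addressed by naming its module first.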

Actions in CK are functionalities offered by modules to operate on CK entries. Let’s make a few concrete examples:

Every command line in CK has the same basic form to perform an action of a particular module:

ck action module

Therefore, we write: ck compile program, ck add_file_to dataset, ck rerun experiment, and so on.

This style is deliberately designed so that the commands read like sentences. I call this ck action module structure the grammar of CK.

CK commands which talk about particular entries specify them by using the following notation:

ck action module:entry

Sometimes it is required to help CK distinguish between entries in different repositories. In these cases we have to write:

ck action repository:module:entry
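To make the three forms of the grammar concrete, here is a minimal Python sketch that splits such a command line into its parts. It is an illustration under my own assumptions, not how CK parses its command line: the names Command and parse_command are hypothetical, my-repo is a made-up repository name, and anything after the third word (such as flags) is simply ignored.

```python
from typing import NamedTuple, Optional


class Command(NamedTuple):
    """The pieces of a 'ck action [repository:]module[:entry]' command."""
    action: str
    module: str
    entry: Optional[str] = None
    repository: Optional[str] = None


def parse_command(line):
    """Split a command line of the form described above into its parts."""
    words = line.split()
    if len(words) < 3 or words[0] != "ck":
        raise ValueError("expected: ck action [repository:]module[:entry]")
    action, target = words[1], words[2]
    parts = target.split(":")
    if len(parts) == 1:   # ck action module
        return Command(action, parts[0])
    if len(parts) == 2:   # ck action module:entry
        return Command(action, parts[0], parts[1])
    if len(parts) == 3:   # ck action repository:module:entry
        return Command(action, parts[1], parts[2], parts[0])
    raise ValueError("too many ':' separators in %r" % target)


if __name__ == "__main__":
    print(parse_command("ck compile program"))
    print(parse_command("ck compile program:my-awesome-program"))
    print(parse_command("ck compile my-repo:program:my-awesome-program"))
```

Note that the repository, when given, comes first in the command but is the least often needed part, so the sketch stores it as the last, optional field.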

Many modules allow you to specify additional options as command line flags. You can get a full list of the actions supported by a particular module by calling:

ck help module

CK modules for managing repositories and modules

CK also has modules for managing repositories and modules themselves. These are called repo and module and are briefly described here.

repo

Repositories are a central concept in CK (as we have seen above). They are managed by the repo module.

Here are some things one can do with this module:

There are a number of things one can do with a particular repository. We take the ck-autotuning repository as an example:

module

Modules are managed by a module called module.

Similarly to the actions on repositories one can:

To list only the modules of a particular repository, for example ck-autotuning, one can execute:

ck list module --repo_uoa=ck-autotuning

The --repo_uoa=ck-autotuning part is an input argument passed to the list action of the module module. To list all the possible input arguments of an action, call ck action module --help, for example ck list module --help. This prints a description of the action, the input arguments it processes, and the output it returns.
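One way to picture input arguments is as key/value pairs collected into a dictionary before the action runs. The sketch below is an assumption about how that could look, not a description of CK's actual parser: the function name flags_to_dict is hypothetical, and mapping a flag without a value to "yes" is a choice made only for this example.

```python
def flags_to_dict(args):
    """Collect '--key=value' style arguments into a dictionary."""
    options = {}
    for arg in args:
        if not arg.startswith("--"):
            continue  # positional words such as 'list' or 'module' are skipped
        key, sep, value = arg[2:].partition("=")
        # Assumption for this sketch: a bare flag such as '--help' maps to "yes".
        options[key] = value if sep else "yes"
    return options


if __name__ == "__main__":
    print(flags_to_dict(["list", "module", "--repo_uoa=ck-autotuning"]))
```

Seen this way, ck list module --repo_uoa=ck-autotuning is the list action of the module module receiving one named input, repo_uoa.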

Common CK actions

There are some actions which can be used on every module. These are called common actions. You can list all common actions by running: ck help.

Furthermore, you can always call ck action module --help to learn about the input arguments and return values of an action.

Many of the common actions are for managing CK entries. The most important of them are:

Where to go from here?

I have only scratched the surface of CK. I haven’t talked about the metadata format (which is JSON) or the implementation of your own custom modules (which is commonly done in Python).

As I said in the beginning, there is plenty of documentation available on the CK wiki. It is incredibly useful to keep the vocabulary (entries, repositories, modules) and the grammar (ck action module) of CK in mind while you read these documents and start playing around with CK.

The two most appropriate starting points are the Getting Started Guide and the Portable Workflows page.

To see how to implement your own workflow with CK by following an example, read the Getting Started Guide.

To learn how to implement portable workflows with CK, read the corresponding sections of the Portable Workflows page.

Also, ask questions on the CK mailing list. The community is very open to answering your questions!

  1. If you have trouble installing CK this way, you can find more information in the CK wiki.