Saturday, April 21, 2018

mdal: model driven ai language

AI, as any other technology, matures in phases. It is a field of active study, and like COBOL or FORTRAN from 60 years ago, AI techniques today are mostly exercised by a small cohort of people (22,000 in a recent estimate [1]), but will proliferate masses (18,200,000 as of 2013 [2]) once the period of breakthroughs are over.

In this post I want to experiment with nouns, verbs and syntax structures we can expect one day in high-level declarative AI languages akin to Apache Pig or SDPL [3] for data processing today.

Most applications of AI/ML produce models. In very simplistic terms, these are numbers (weights) assigned to every node in the neural network, or numbers (weights) for regression formulas. Models also have parameters. Among them - number of inputs for neural network or features for regression.

Let's use verb LOAD to load a model and verb STORE to persist it.

For most Deep Neural Networks, models are expensive to train, and will likely be re-used as open source libs are today.
Let's use verb EXTEND for declaring new derived models and REUSE # LAYERS construct to identify depth of model reuse. Verb TRAIN will commence model training and preposition WITH will point to the data repository.

Just like TensorFlow today abstracts the computing devices [4], and like Flow [5] abstracts action from the execution cluster, we should expect our declarative language to abstract training and execution. Hence, lets add --engine to the command-line arguments.

Two nouns to complete the picture: INPUT and OUTPUT. Both need detailization: SCALE FOR to match input data to AI/ML model requirements and IN FORMAT to define data format.

In summary, let's scribble an AI program in our hypothetical MDAL language:





[4] TensorFlow dynamic placer


No comments: