Thursday, January 07, 2021

cyber-security extension boards

2020 was the year the populace recognized its computational hunger was bigger than ever before: some wanted to play Cyberpunk 2077, some to mine cryptocurrencies, and yet others to train and run inference on ML models and deep neural networks.

Intel is probably the biggest disappointment of the last decade - the x86 CPU has barely improved, with the exception of adding a few cores. And even those additional cycles are so anemic they can barely keep up with JavaScript engines.

It would not be an exaggeration to say that every computer enthusiast harbours silent hopes for at least a 100x performance improvement over the next few years, to, let's say, 3000 TFLOPS (which of course will take much longer).

Today, our laptops are powerful enough to play video and edit text, yet the user can unmistakably tell when an antivirus or system scan kicks in. All this at a time when roughly five digital records are stolen per living person every year [1], an impressive toll of about 33*10^9 records annually.
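
As a quick sanity check on that figure (the ~7.8 billion world population is my assumption, not a number from [1]):

    # Back-of-the-envelope check of the breach arithmetic, in Python.
    records_per_year = 33e9      # stolen records annually, per [1]
    world_population = 7.8e9     # approximate 2020 world population (assumed)
    print(records_per_year / world_population)  # ~4.2, i.e. roughly four to five records per person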

Cybersecurity systems of tomorrow will almost certainly perform pattern recognition over binaries and process memory pages, as pioneered by Deep Instinct [2], and will saturate disk IO and processors with inference. Such services, however, would require hardened hardware security of their own.
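
To make the idea concrete, here is a minimal sketch of the approach - not Deep Instinct's actual pipeline, and with file paths and labels that are purely hypothetical: reduce an executable to a fixed-size feature vector of raw-byte statistics and hand it to a classifier, whose scoring step is the per-file inference a background scanner would have to sustain.

    # Byte-histogram features plus an off-the-shelf classifier standing in for a
    # deep neural network. Paths and labels are hypothetical; real systems use far
    # richer features and models.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def byte_histogram(path):
        """Normalized 256-bin histogram of byte values in a file."""
        data = np.fromfile(path, dtype=np.uint8)
        hist = np.bincount(data, minlength=256).astype(np.float64)
        return hist / max(len(data), 1)

    # Toy training set: two binaries with made-up benign (0) / malicious (1) labels.
    paths, labels = ["/bin/ls", "/bin/cat"], [0, 1]
    X = np.stack([byte_histogram(p) for p in paths])
    clf = RandomForestClassifier(n_estimators=100).fit(X, labels)

    # Scoring a new binary - the inference step a background scanner repeats per file.
    print(clf.predict_proba(byte_histogram("/bin/echo").reshape(1, -1)))

Scaled to every binary and memory page on a machine, it is the feature extraction (disk IO) and the scoring (compute) in this loop that would saturate a general-purpose system - hence the case for dedicated hardware.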

And that in turn means that, in addition to buying GPUs and disks, users will likely be purchasing cyber-security extension boards in a few years. Such a board will come with its own ML processor and a wide memory channel, and most likely with a subscription to a cyber-security provider.

https://static.tweaktown.com/news/6/7/67383_01_legendary-3dfx-voodoo-5-6000-shown-4-way-sli-single-board.jpg

Cheers!

[1] https://www.juniperresearch.com/press/press-releases/cybersecurity-breaches-to-result-in-over-146-bn

[2] https://www.deepinstinct.com/



Friday, July 03, 2020

time to make computing personal again

Historically, we have been good at choosing the right tool for the job:

[images: homestead vs commercial tools - fishing, grass cutting, shape adjustment]

In the examples above, at least three factors stand out: cost of acquisition, cost of maintenance and operational complexity. When choosing tools for home, I prefer those with maximum ease of use, and in many cases I save for months or even years to get the desired one.

However, this approach does not hold in computer engineering - the prevalent trend seems to be:

the bigger, the costlier, the more complex - the better

We can use LOC (lines of code) as a measure of complexity and see how everyday software fares (a rough way to reproduce such counts is sketched after the table):

software        lines of code  reference
Windows            50 000 000  [4]
Linux kernel       27 800 000  [2]
Firefox            21 000 000  [7]
Docker             10 000 000  [8]
Libre Office        9 500 000  [3]
Kubernetes          1 510 000  [9]
Gnome                 945 000  [6]
OpenShot              715 000  [5]
VLC                   685 000  [10]
Kafka                 470 000  [13]
Krusader               57 000  [11]

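The counts above can be approximated with a crude script like the one below - a sketch only, counting non-blank lines under an assumed set of source extensions, whereas the cited Open Hub and vendor figures use more careful tooling:

    # Rough LOC counter: walk a source checkout and count non-blank lines in files
    # with the given extensions. The extension set is an assumption; adjust per project.
    import os

    SOURCE_EXT = {".c", ".h", ".cpp", ".go", ".py", ".js", ".rs", ".java"}

    def count_loc(root):
        total = 0
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                if os.path.splitext(name)[1] in SOURCE_EXT:
                    try:
                        with open(os.path.join(dirpath, name), "rb") as f:
                            total += sum(1 for line in f if line.strip())
                    except OSError:
                        pass
        return total

    print(count_loc("linux"))  # point it at a kernel checkout; expect tens of millions
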
But why should we complain if software works, and works well? After all, not many people can work on their cars. A select few can change the oil or brake pads, and everything else is so computerized and specialized that it requires a certified mechanic.

This is exactly the problem I believe we should avoid in computer engineering. The creative part of our community was founded on four essential freedoms: (0) to run the program, (1) to study and change the program in source code form, (2) to redistribute exact copies, and (3) to distribute modified versions [12].

However, the sheer size of modern software strikes through this philosophy, as only standing teams can master anything above 500K LOC. That effectively aligns most large projects with corporate interests: the larger the source code base, the more important the corporate support, and the less tinker-friendly the project.

The Linux community:
A corporate controlled committee of people
who don't use Linux and dislike ideas
 Bryan Lunduke [19]

Don't get me wrong - it is great that Kafka or Linux are open source software. However, I have serious doubts whether homed or self-balancing brokers aid personal use.

Should we perhaps triage existing open source software projects and label them according to their complexity and intended use: corp-only, home-friendly and tinker-friendly? And if, in the process, we classify the existing Linux kernel as corp-only, should we then look to develop a home-friendly open source operating system?

Let's take this thought a step further - given current hardware trends, we can expect two 16-core servers per homestead by the turn of the decade. Do we really expect people to learn Kubernetes to utilize all this hardware? Let's just enumerate the concepts required to use the latter: deployment, replica set, pod, label, rolling update, health check, environment variables, secrets, resource management, horizontal pod autoscaler, namespace, service, ingress, annotations, persistent volumes, jobs, config map, etc. Security is extra.

That is on top of: computer science data structures & algorithms, file systems, memory tables, CPU interrupts, networks & protocols & routing, programming languages, compilers, GUIs.

De facto, we require an engineer to master two types of operating systems - single-node and multi-node - before even conceiving something fun!

Note - business skills, such as data processing, statistics or deep learning, are packaged and sold separately.

No one knows the whole kernel
Linus Torvalds [1]

It is overwhelming for the current generation of engineers, many of whom learned the basics tinkering with Commodores, Ataris or ZX Spectrums (whose BASIC had 88 keywords [18], of which I probably used 40). But how surmountable is it for the next generation, born to the touchscreen and the "cloud"? A good approximation could be that Google, with 2 billion lines of code [15], has arguably not innovated in the past decade [14], while constantly producing ever more scalable and performant infrastructure.

I think that if we want the next generations of enthusiasts and innovators to push the boundaries of innovation and discovery, to explore the computational universe and cross the aisle to genetics, jurisprudence and other disciplines, we need to scale down the complexity of software projects.

We don't need a combined ~39 million lines of code (27.8M + 10M + 1.5M) between the Linux kernel, Docker and Kubernetes if we can use Plan9 at the 2 million mark [16]. Yes, it has no corporate appeal for now, but when it does, we will need to protect it from growing into Linux-2.

It is quite possible that a non-profit foundation like the FSF [17] needs to set up a cyber-preserve where projects are kept from overgrowing a certain complexity threshold, with forking & a name change mandated when it happens. We also probably need foundation-driven distros, providing software &/| hardware for home use that could replace the majority of cloud services with local ones. For instance:

os -> plan9 + gnu tools
system programming language -> go
database -> pgsql, rocksdb
google search -> computational knowledge engine
photo, audio, video -> kodi
video editing -> openshot
image editing -> krita, gimp
office -> libre office
browser -> ?

And for that to happen, we likely need to come up with an FSF funding plan.

Cheers!


[1] https://www.zdnet.com/article/even-linus-torvalds-doesnt-completely-understand-the-linux-kernel/

[2] https://www.linux.com/news/linux-in-2020-27-8-million-lines-of-code-in-the-kernel-1-3-million-in-systemd/

[3] https://www.openhub.net/p/libreoffice

[4] https://www.wired.com/2015/09/google-2-billion-lines-codeand-one-place/

[5] https://www.openhub.net/p/openshot-video-editor

[6] https://www.openhub.net/p/gnome

[7] https://hacks.mozilla.org/2020/04/code-quality-tools-at-mozilla/

[8] https://www.openhub.net/p/docker

[9] https://www.openhub.net/p/kubernetes

[10] https://www.openhub.net/p/vlc

[11] https://www.openhub.net/p/krusader

[12] https://www.gnu.org/philosophy/philosophy.html

[13] https://www.openhub.net/p/apache-kafka

[14] https://secondbreakfast.co/google-blew-a-ten-year-lead

[15] https://www.wired.com/2015/09/google-2-billion-lines-codeand-one-place/

[16] https://www.openhub.net/p/plan9

[17] https://www.fsf.org

[18] https://en.wikipedia.org/wiki/Sinclair_BASIC

[19] https://www.youtube.com/watch?v=cZN5n6C9gM4

Monday, July 23, 2018

military ai ban = void ban

Recent headlines on banning AI from military use are reminiscent of the 1968 Non-Proliferation Treaty [1] - an elegant document that has not prevented a score of countries such as India, Pakistan, Israel or North Korea from developing nuclear weapons.

Today we hear about a "public pledge" [2] not to develop militarily applicable AI. Some call for a total ban. But just how effective, or even moral, is such a ban? For it to be effective, it must be adhered to by all capable parties. Nuclear development, for instance, requires large facilities visible from space and large groups of scientists and engineers, and takes decades to complete. Such massive efforts are impossible without a central government, and due to the scale and duration of the program there is time for politicians to intervene and negotiate.

Not so with military AI capability. It is completely asymmetrical. I would estimate that a group of 4 software engineers, 2 electrical engineers and 2 mechanical engineers would be able to produce killing machines on a mass scale.

Several publications have appeared over the years [3] [4] linking engineering degrees with terrorism. It is an easy correlation, but at the heart of it is the gentle learning curve leading to very powerful technologies.

For instance, a computer vision neural network combined with a Vertical Take-off and Landing (VTOL) airframe and some military payload (whether a rifle, an explosive, a gas or a biological agent) would likely cover a wide spectrum of applications for destabilizing society, such as political assassinations or the targeting of ethnic groups.

It is obvious that rogue killing drones are a matter of years, if not months. As a society, we must understand that there is no easy or cheap solution to the proliferation of killing machines. A ban can only restrain official law enforcement and the military from obtaining such capability, impairing their ability to react to rogue or terrorist technologies. In other words - with a ban in effect, it will be front-line officers fighting rogue robots in the streets of our cities.

It is likely that surveillance could be a short-term tool to boost security. In the long run, however, it is the development of "good" military/police AI that could restore the balance.

Cheers!

[1] https://en.wikipedia.org/wiki/Treaty_on_the_Non-Proliferation_of_Nuclear_Weapons

[2] https://www.independent.co.uk/life-style/gadgets-and-tech/news/elon-musk-killer-robot-artificial-intelligence-pledge-military-drone-a8453611.html

[3] https://science.slashdot.org/story/15/11/25/1326242/engineers-nine-times-more-likely-than-expected-to-become-terrorists

[4] https://news.slashdot.org/story/09/12/30/1318240/why-do-so-many-terrorists-have-engineering-degrees