This was my first time speaking at a public event and I was terribly nervous. I even got through it in half the time!
Data Science is a hot topic in boardrooms right now. Everybody wants to adopt AI/ML, hire the best and brightest data scientists, and enable them to experiment and build intelligent applications. New libraries have made it possible to analyze new types of data and even gain new insights from historical data. Massive amounts of data being generated from the boom in IoT computing mean there’s even more demand for ML aggregation at the edge. Everybody wants in.
Are you a data scientist, engineer, or researcher who is just getting into distributed processing and PySpark, and who wants to run some of the fancy new Python libraries you've heard about, like Matplotlib?
If so, you may have noticed that it's not as simple as installing them on your local machine and submitting jobs to the cluster. In order for the Spark executors to access these libraries, they have to live on each of the Spark worker nodes.
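One rough way to see this for yourself is to ask each executor whether it can import a given library. The sketch below is only an illustration: `probe_import` is a hypothetical helper name, and the commented-out cluster usage assumes you already have a SparkContext called `sc`.

```python
import importlib
import socket


def probe_import(module_name):
    """Report whether module_name can be imported on the machine running this call.

    Returns (hostname, version), where version is None if the import fails
    and "unknown" if the module does not expose __version__.
    """
    host = socket.gethostname()
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return host, None
    return host, getattr(module, "__version__", "unknown")


# Runs locally, so this only tells you about the driver machine.
print(probe_import("json"))

# On a cluster (assumption: an existing SparkContext named `sc`), the same
# function can be mapped over many small tasks so most workers get sampled:
# results = (sc.parallelize(range(100), 100)
#              .map(lambda _: probe_import("matplotlib"))
#              .distinct()
#              .collect())
```

A worker that reports None for the version is one where the library is not installed in the Python the executors are using.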
You could go through and manually install each of these environments using pip, but maybe you also want the ability to use multiple versions of Python or other libraries like pandas? Maybe you also want to allow other colleagues to specify their own environments and combinations?
If this is the case, then you should look toward using conda environments to provide specialized and personalized Python configurations that are accessible to Python programs. Conda is a tool that keeps track of conda packages (tarball files containing Python or other libraries) and maintains the dependencies between packages and the platform.
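As a rough illustration of how a conda environment can travel with a job, here is a minimal sketch that only assembles the commands involved; it does not run them. It assumes the separate conda-pack tool is installed and that your cluster manager supports the `--archives` option of spark-submit (YARN does). It may not match the approach shown in the talk. The helper name `conda_spark_commands`, the environment name `pyspark_env`, the script `app.py`, and the alias `environment` are all hypothetical.

```python
import shlex


def conda_spark_commands(env_name, app_path, alias="environment"):
    """Build the commands for shipping a packed conda environment with a Spark job.

    Returns (pack_cmd, submit_env, submit_cmd):
      pack_cmd   - archives the named conda environment with conda-pack
      submit_env - environment variables to set before calling spark-submit
      submit_cmd - spark-submit call that ships the archive to the executors
    """
    archive = f"{env_name}.tar.gz"
    pack_cmd = ["conda", "pack", "-n", env_name, "-o", archive]
    # Executors use the Python inside the unpacked archive; the driver keeps
    # its local Python (assumption: client deploy mode).
    submit_env = {
        "PYSPARK_DRIVER_PYTHON": "python",
        "PYSPARK_PYTHON": f"./{alias}/bin/python",
    }
    # "archive#alias" asks Spark to unpack the archive into a directory
    # named `alias` in each executor's working directory.
    submit_cmd = ["spark-submit", "--archives", f"{archive}#{alias}", app_path]
    return pack_cmd, submit_env, submit_cmd


pack_cmd, submit_env, submit_cmd = conda_spark_commands("pyspark_env", "app.py")
print(shlex.join(pack_cmd))
for key, value in submit_env.items():
    print(f"export {key}={value}")
print(shlex.join(submit_cmd))
```

Because each colleague can pack and ship a different environment, several Python and library versions can coexist on the same cluster without anything being installed by hand on the worker nodes.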
Microservices are simple, single-purpose applications that work in unison via lightweight communications, such as data streams. They allow you to more easily manage segmented efforts to build, integrate, and coordinate your applications in ways that have traditionally been impossible with monolithic applications.
Rachel Silver is the Product Management Lead for Machine Learning & AI @ MapR Data Technologies.