Alcid Analytics LLC

Data Science and Business Consulting Services by Mike Cunha

Services

I help clients use data to achieve their goals, from data collection, processing, and analysis to communicating the results and deploying data-driven products to production.

Machine Learning and Statistics

I can help you apply ML, deep learning, and statistics to your business processes: classification, A/B testing, prediction, recommendation, sentiment, search, natural language processing, entity resolution and more.

Dashboards and Automation

Many clients have a slow but critical reporting process that can be sped up by automating manual and Excel workflows and by making the output more versatile. Common things I deliver for these types of projects are data processing pipelines, automated reporting, and dashboards. Read More about common reasons clients hire me or take a look at an example web interface I have built.

Retainer

For some clients, I serve as an on-call partner by providing data science and data engineering technical expertise and advice to test new strategies on a regular basis, diagnose problems as they appear, and help companies that are just beginning to build out their data team.

Training

I also offer one-on-one mentoring and small group sessions on topics like data visualization, data cleaning, automated reporting, machine learning, and hypothesis testing upon request. I can give crash courses on using tools like Python for data science, Jupyter, and Tableau as well.

Learn More

Contact Me and Discuss What I Can Help You With.

I am currently taking on new clients. Reach out to me for a free consultation.

Projects

Some common problems and opportunities I can help you with:


Too Much

Having problems scaling an existing data-related process? Spreadsheets getting too complicated? Or maybe you have so much data you don't know where to start. I automate time-consuming repetitive tasks and help you decide which opportunities to pursue now and in the long term. I can also help you build tools to search through your data, whether you are an analyst bogged down with slow SQL queries or an HR specialist unable to locate a form on your intranet.

Too Messy

Many businesses struggle with poor data quality, from sole proprietors to multinational corporations. It's hard to make informed decisions when you can't get what you need, when you need it. I help clients assess data quality and how it's impacting their business, improve it with better data collection practices, and make the most of what they already have. Perfect data isn't practical; often, great value can be had from data that's just good enough.

Optimize

Some obvious questions can be very difficult to answer: What price should I charge? Which combination of products in a bundled offer will yield the most profit? Which part of my inventory can I liquidate to free up the most cash with the least risk? These are all questions I can help you answer.

Cutting Edge

Thanks to widely available open source libraries and cheap cloud computing you don't need to hire a team of deep learning experts or buy an expensive proprietary appliance to benefit from some of the unstructured data you probably already have: customer feedback, blog posts, documents, images, audio, or social media posts. Software that can automatically summarize, classify and act on unstructured data is not just for Fortune 500 companies. I also have experience using domain-specific data to produce custom models that perform better at specific tasks than the widely available machine learning APIs offered by major cloud providers.

From Reactive to Proactive

Reporting tells you what happened, real-time reporting tells you what is happening, and forecasting tells you what could happen. By training a predictive model on historical data and conducting time-series analyses, I can help you make informed decisions before your normal reporting would usually be available.
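To make the idea of forecasting from historical data concrete, here is a minimal sketch using simple exponential smoothing. It is a far simpler technique than a full time-series model such as ARIMA, and it is shown only as an illustration. The function name, the `alpha` value, and the sample figures are all hypothetical.

```python
def exponential_smoothing_forecast(history, alpha=0.5):
    """Return a one-step-ahead forecast using simple exponential smoothing.

    history: past observations, oldest first (hypothetical monthly totals).
    alpha:   smoothing factor in (0, 1]; higher values weight recent data more.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")

    # Start from the first observation, then blend in each later value.
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical sales history for three periods.
sales_history = [100, 110, 120]
next_period = exponential_smoothing_forecast(sales_history, alpha=0.5)
print(next_period)  # → 112.5
```

A real engagement would normally compare several models against held-out data before trusting any single forecast.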

Customer Insight

Some decisions on how to interact with your customers have to be made, regardless of available data quality. Validating the assumptions you make about your customers can be critical. I can help you learn more about your customers and validate the assumptions you have already made. I have experience analyzing unstructured text like customer reviews, chat transcripts, and comments; mining log files for pain points and buying patterns; segmenting customers into personas; scoring leads; modeling churn; and optimizing retention.

Siloes

Each department has its own database, for its own purposes, often undocumented outside of that department. How do you know if the 'John Smith' in your marketing database is the same customer as the 'John Smith' in your sales database? I have experience partnering with in-house teams to assist in combining disparate schemas, building data dictionaries, ETLs, and matching and merging millions of contact records.
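The 'John Smith' question can be sketched with Python's standard library alone. This is a minimal illustration, not a production matcher. The record fields (`id`, `name`, `email`), the sample data, and the 0.85 threshold are all hypothetical. Two records are linked only when the names are similar and the emails agree, so two different people who share a name are not merged.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(text.lower().split())


def name_similarity(a, b):
    """Return a 0..1 similarity score between two names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_records(marketing, sales, threshold=0.85):
    """Return (marketing_id, sales_id) pairs that likely refer to the same person."""
    matches = []
    for m in marketing:
        for s in sales:
            same_email = normalize(m["email"]) == normalize(s["email"])
            similar_name = name_similarity(m["name"], s["name"]) >= threshold
            if same_email and similar_name:
                matches.append((m["id"], s["id"]))
    return matches


# Hypothetical records from two departmental databases.
marketing = [
    {"id": "M1", "name": "John Smith", "email": "jsmith@example.com"},
    {"id": "M2", "name": "John Smith", "email": "john.smith@example.org"},
]
sales = [
    {"id": "S1", "name": "Jon  Smith", "email": "JSmith@example.com"},
    {"id": "S2", "name": "Jane Smythe", "email": "jane@example.com"},
]

print(match_records(marketing, sales))  # → [('M1', 'S1')]
```

The nested loop compares every pair, which is fine for a demonstration but too slow for millions of records. At that scale the usual approach is to group records by a cheap key first (for example a normalized email domain or postal code) and compare only within each group.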

Guidance

There has been an explosion of data-related vendors and services in the past few years, and likewise, there is no shortage of salespeople eagerly telling you how you're a perfect fit for their one-click solution. I provide clients with technical advice that allows them to see through the hype and find the right tools and services for their business. I have also worked with several smaller customers looking for advice on their first data science hires and efforts.

Apps

For clients that don't have access to enterprise-level tools, I have built custom lightweight solutions. Contact me for access to demo.alcidanalytics.com to see an example of a secure web interface to an automated data pipeline along with a d3.js-powered interactive dashboard.


Dashboards

Custom, interactive, in the browser. Click the GIF below to see an example.

Animated GIF showing an Alcid Analytics interactive dashboard consisting of a bar chart, donut chart, line chart, and several subtotals. As a user clicks on items in a chart, that chart and the other visualizations and totals are updated and filtered in real time.

Run ETLs

Customized, easy to use interfaces for running complex automated processes.

Animated GIF showing a user navigating through a multi-step web form, submitting the form, and then receiving status updates from the automated process that the form submission triggered, demonstrating an easy-to-use web interface to automated processes.

Responsive

A consistent interface across screens allows clients to access the solutions I build on almost any device.

Secure

Apps adhere to security best practices and can range from being public-facing to a private extension of a client's intranet over VPN, complete with role-based user management.

Live Editing

Display an Excel-like view of data underlying a dashboard and edit it in-place.

Animated screen capture showing a user click on tabular data in a table and update a value.

Other Deliverables

Not every project needs an app or a web interface; there are also one-off analyses, Python packages, productionized models, and other software.

Notebooks

Jupyter notebooks allow me to show a client exactly how I performed each step of an analysis and tailor the explanation to the client's technical ability. They are an important tool I use to make data accessible to a wide variety of teams within a client's organization. Decision makers can use them to see specific results and visualizations, while hiding complex details. The detail is easily revealed, allowing the client's in-house team to quickly verify the results. Even one-off analyses need to be reproducible and will come version controlled with documentation.

Code

For projects that involve more software engineering effort than one-off analyses, such as real-time recommenders, classifiers, and ETLs, I like to adhere to additional best practices to ensure the deliverables are production-ready and maintainable even if I'm not the one doing the deploy. Tests, docstrings, consistent formatting, logging, and refactoring code into pip-installable Python packages where appropriate are the norm. Code Sample

Articles

Latest posts and guides from the Alcid blog.

Making a Geographic Heatmap with Python

How to make an interactive geographic heatmap using Python and free tools. This example uses Folium, a Python wrapper for leaflet.js maps and geopandas. Read More


About Me

I am a data scientist based out of the Rochester, NY area, working as a freelance consultant since 2015 and in data science since 2013. I have an MSc in Natural Resource Management from HSU and a BS in Biology from SUNY ESF. A seabird biologist turned data scientist, I am most interested in applied machine learning, NLP, and data science ethics. The following is a summary of my experience, skills, and some of the tools I'm comfortable using:

Mike Cunha
Natural Language Processing
Collection and analysis of unstructured texts including: parsing, tagging, categorizing, topic modeling, sentiment, language modeling, word embeddings, chatbots, and scraping. Gensim, spaCy, Beautiful Soup, Selenium, and deep learning.
Machine Learning, Data Mining, Deep Learning, "AI"
I have used PyTorch, Keras, scikit-learn, statsmodels, and other libraries for classification, clustering, prediction, optimization, forecasting, and recommendation.
SQL
I have been writing SQL on and off for the past 8+ years and have worked with many popular databases like SQL Server, Oracle, MySQL, Postgres, BigQuery, Teradata, Elasticsearch, and MongoDB.
Data Munging and EDA
When I have a choice, I tend to use Unix command-line tools and Pandas in Python with Jupyter notebooks before refactoring into standalone .py files and modules. I reach for Dask and PySpark when things get too slow.
Visualization
Tableau; Matplotlib, Seaborn, Bokeh, Chartify, and geopandas for Python; ggplot2 for R; rarely d3.js and Leaflet.
Development and Operations
First and foremost: Python. Also, when I have to, JavaScript, HTML/CSS, PHP, and R. I am very comfortable at a Unix command-line and with Amazon and Google Cloud services, Linux, Apache, Nginx, and IIS. I am proficient in Git and a quick study. I'm also comfortable using Luigi, bash scripting, Ansible, Flask, App Engine, GCS, S3, EC2, and others!
Web Analytics
Google Analytics, Conversion Rate Optimization via A/B testing and Multi-armed Bandits, custom tracking pixels, Google Tag Manager, many more...
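As a small illustration of the A/B testing mentioned above, the sketch below runs a two-sided two-proportion z-test using only Python's standard library. The function name and the visitor and conversion counts are hypothetical, and this is one common way to compare conversion rates rather than a description of any particular client project.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare two conversion rates; return (z statistic, two-sided p-value).

    conv_a, conv_b: number of conversions in variants A and B.
    n_a, n_b:       number of visitors shown variants A and B.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("cannot test when the pooled rate is 0 or 1")

    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical experiment: 1,000 visitors per variant.
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference in conversion rates is unlikely to be due to chance alone. The sample size and stopping rule should still be decided before the experiment starts.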

Contact Me
