Job title
What the role does
Personality type it suits
What a typical day looks like
Salary levels
Further career prospects
Hard Skills
Soft Skills
Common requirements in job postings
Additional requirements in job postings
Requirements by level: hard skills, soft skills, job postings
Big Data Software Engineer / Data Engineer
Linear algebra. Calculus. Statistics and Probability Theory.
Machine Learning Algorithms: regression, simulation, scenario analysis, modeling, clustering, decision trees, etc.
Python 3, Pandas, scikit-learn, Keras, TensorFlow, NumPy, PyTorch.
Data visualization.
Software engineering methodologies, functional programming or object-oriented programming.
DevOps: containerization and orchestration.
Classic DBs (relational or object): MySQL, PostgreSQL, Amazon RDS.
NoSQL (document-oriented and other stores): MongoDB, Cassandra, HBase, Elasticsearch, Redis, DynamoDB.
NewSQL (hybrid/in-memory): MemSQL, VoltDB.
Query engines: Impala, Presto.
Cloud platforms (GCP, AWS). Cloud computation (Dataflow, Dataproc). Streaming (Pub/Sub, Kafka). Data storage (BigQuery, Cloud SQL, Cloud Spanner, Firestore, BigTable).
ETL Concepts / Processes.
Data Warehouse technologies, Data Lake architecture.
Data modeling: Bachman diagrams, Chen's notation, object-relational mapping, etc.
Processing frameworks: Apache Spark (PySpark/SparkR/sparklyr), Flink, Beam, Kafka Streams.
Data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
Data Scientist
Python (PyCharm, Pandas, NumPy, bs4, sklearn, scipy). R.
Linear algebra. Calculus. Statistics.
Machine Learning techniques (Decision Trees, Random Forest, SVM, Bayesian methods, XGBoost, K-Nearest Neighbors) and concepts: regression and classification, clustering, feature selection, feature engineering, the curse of dimensionality, bias-variance tradeoff.
Data visualization.
Data Mining (Clustering, Frequent Pattern Mining, Outlier Detection).
Neural Networks and ML Packages (sklearn, XGBoost, TensorFlow, Keras, H2O).
Cloud platforms (GCP, AWS). Cloud computation (Dataflow, Dataproc). Streaming (Pub/Sub, Kafka). Data storage (BigQuery, Cloud SQL, Cloud Spanner, Firestore, BigTable).
Databases: SQL and NoSQL, AWS cloud storage, GDPR data privacy.
Processing frameworks: Hadoop, Spark.
Business Intelligence Software (Power BI, Tableau, Qlik, Cognos Analytics).
Machine Learning Engineer
Computer science fundamentals, algorithms, mathematics, linear algebra, probability, and statistics.
Python (Pandas, NumPy, scikit-learn, TensorFlow, Keras).
Python visualization tools: matplotlib/seaborn, Plotly.
Machine Learning techniques (Decision Trees, Random Forest, SVM, Bayesian methods, XGBoost, K-Nearest Neighbors) and concepts: regression and classification, clustering, feature selection, feature engineering, the curse of dimensionality, bias-variance tradeoff.
Deep Learning: Recurrent Neural Network (LSTM/GRU units), Convolutional Neural Network.
Machine learning frameworks (TensorFlow, Caffe2, PyTorch, Spark ML, scikit-learn) and ML techniques: GAN, ASR, RL.
Databases: SQL and NoSQL. Hadoop ecosystem.
Processing frameworks: Apache Spark (PySpark/SparkR/sparklyr).
Cloud platforms (GCP, AWS).
Data Analyst
Math, Statistics (regression, properties of distributions, statistical tests and their proper usage, etc.) and Probability Theory.
Statistical programming software (R, Python, SAS, Matlab).
Predictive analytics (regression models, time-series analysis and forecasting, survival or duration analysis).
BI tools: Google Data Studio / Microsoft Power BI / Tableau.
Classic DBs: MySQL.
MS Excel.
A/B testing.
NLP Engineer / NLP Data Scientist
Python (sklearn, nltk, gensim, spaCy, TensorFlow, PyTorch, Keras) and Python Data Science toolkit: Jupyter Notebook, Pandas, NumPy, Matplotlib/Seaborn, SciPy.
Databases: SQL and NoSQL (MySQL, MongoDB, PostgreSQL).
NLP libraries: NLTK, spaCy, Stanford CoreNLP, etc.
NLP techniques for text representation (TF-IDF, Word2Vec), semantic extraction, data structures and modeling.
Methods of Information Extraction (NER, terminology extraction, keyword extraction, etc.).
Machine Learning techniques and concepts (regression, trees, SVM, ensembles) for NLP tasks.
CV Engineer
Linear Algebra. Geometry. Calculus. Statistics and Probability theory.
Python 3: NumPy, pandas, seaborn, SciPy.
Computer vision / image processing libraries such as: OpenCV, Pillow.
Neural network architectures: Convolutional Neural Networks (Inception, residual networks), LSTM, GAN.
Neural network frameworks: TensorFlow, PyTorch.
Computer vision algorithms and architectures: object detection, segmentation, face recognition, image processing, video processing.
Real-time CV systems based on Deep Learning.
Cloud model training (GCP, AWS), Cloud integration, Cloud Platforms.
Performance metrics in object detection and classification, such as mAP and related.
Big Data (Hadoop, Spark, Hive).
Deep Learning Engineer / Deep Learning Research Engineer
Python 3: NumPy, scikit-learn, pandas, SciPy.
Statistics (regression, properties of distributions, statistical tests and their proper usage, etc.) and probability theory.
Deep learning frameworks: TensorFlow, PyTorch, MXNet, Caffe, Keras.
Deep learning architectures: VGG, ResNet, Inception, MobileNet.
Deep neural networks: hyperparameter optimization, visualization, interpretation.
Machine learning models.