For a project at a European institution in Belgium, we are currently looking for a fully remote (Senior) Data Scientist.
Candidates need to be based in Europe and have a work permit. EU candidates need to be fluent in English. This position is open to employees and freelancers.
Tasks and responsibilities:
Development and maintenance of software applications in the field of Natural Language Processing (NLP), Machine Learning (ML) and/or Artificial Intelligence (AI);
Training of custom machine learning / deep learning models based on structured and unstructured data;
Selecting features, building and optimizing classifiers using machine learning techniques;
Following studies and developments aimed at improving the quality of machine translation (MT) engines for each installed language pair;
Interacting with data stewards and other IT stakeholders to define the data rules;
Creating automated anomaly detection systems and constantly tracking their performance;
Data mining using state-of-the-art methods;
Processing, cleansing, and verifying the integrity of data used for analysis;
Designing the IT architecture for solutions in the NLP / ML / AI fields and coordinating its implementation, taking master data and metadata management concepts into account.
Profile:
Master's degree in a related field with 11 years of professional experience in IT;
3+ years of professional experience in the domain as a Data Scientist;
Experience in Machine Learning and Natural Language Processing;
Excellent knowledge of Perl, Python, MATLAB, R and their NLP/ML libraries (spaCy, NLTK, scikit-learn, pandas);
Good knowledge of SQL and NoSQL tooling (MongoDB, Hadoop, SQL);
Knowledge of query languages, such as SQL, Hive, Pig, etc., and experience with information extraction;
Knowledge of NoSQL databases, such as MongoDB, Cassandra, HBase, etc.;
Knowledge of data visualisation tools, such as D3.js, ggplot2, etc.;
Good knowledge of AWS and/or Azure;
Good knowledge of Linux;
Good knowledge of Unix and Bash;
Excellent knowledge of machine learning techniques and algorithms, such as k-NN, Naive Bayes, SVM, Decision Forests, Neural Networks, and/or artificial intelligence frameworks;
Knowledge of at least one of the following areas: predictive analytics (forecasting, recommendation), prescriptive analytics (simulation), sentiment analysis, topic detection, social media crawling and processing, plagiarism detection, trend/anomaly detection in datasets, recommendation systems;
Experience in the field of corpus-based linguistics;
Experience with alignment models and classification methods;
Good knowledge of the lifecycle of natural language processing systems and of agile software development methodologies.