How to choose the right algorithm for the right task?

Nowadays, a wide variety of algorithms is available in different data analytics libraries and toolkits. Therefore, when it comes to choosing an algorithm, the question is not whether an algorithm exists to solve your problem, but rather which one best fits the data science problem you are trying to solve. The next session of the EluciDATA mastercourse on 26 April will focus on choosing the right algorithm for the right task.

While attention today goes mainly to examples from major internet companies (e.g. Google, Amazon, Facebook), data science can also drive innovation in other industrial domains and within SMEs. It makes it possible to derive new insights from experimental data, to profile products and customers, to optimise production processes, to predict machine failures, etc.

In this context, the EluciDATA mastercourse provides pragmatic and industry-oriented sessions on data-driven innovation. This mastercourse (in English) is composed of several independent sessions in order to accommodate the different needs and viewpoints of people with different backgrounds.

Choosing the right algorithm for the right task

One of the final and central steps in the data science workflow is the choice of an appropriate algorithm for the problem you are trying to solve. Given the wealth of algorithms included in data analytics libraries and toolkits, the question is often not whether there is an algorithm for the setting at hand, but rather which one fits best. In addition, the way you formulate your business objective as a data science task can determine the type of algorithm you can apply.

Therefore, the goal of this session is to introduce participants to the most important data science tasks (classification, clustering, regression, etc.) and to give an overview of the algorithms and techniques most commonly used to solve each of them. For each method, its characteristics, advantages and disadvantages will be explained, so that participants can make an informed choice based on the available data (dimensionality, attribute types, etc.) and the expected model requirements (interpretability, accuracy, scalability, etc.). Finally, the guiding principles for training and evaluating the resulting models will be presented, including an overview of common pitfalls and frequently used evaluation measures.
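
As a rough illustration of what training and evaluating candidate models can look like, here is a minimal sketch that is not taken from the course material. It compares a trivial majority-class baseline with a 1-nearest-neighbour classifier using k-fold cross-validation, so that every example is evaluated only by a model that did not see it during training. The synthetic data and all function names (make_data, majority_baseline, nearest_neighbour, cross_val_accuracy) are assumptions made for this example, and only the Python standard library is used.

```python
import random
from collections import Counter

def make_data(n=200, seed=0):
    """Synthetic two-class data: two Gaussian blobs in 2D (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        centre = 0.0 if label == 0 else 3.0
        point = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
        data.append((point, label))
    rng.shuffle(data)
    return data

def majority_baseline(train):
    """Return a predictor that always outputs the most frequent training label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda point: most_common

def nearest_neighbour(train):
    """Return a 1-nearest-neighbour predictor (squared Euclidean distance)."""
    def predict(point):
        closest = min(
            train,
            key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], point)),
        )
        return closest[1]
    return predict

def cross_val_accuracy(data, fit, k=5):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    scores = []
    for fold in range(k):
        test = data[fold::k]  # examples whose index i satisfies i % k == fold
        train = [ex for i, ex in enumerate(data) if i % k != fold]
        predict = fit(train)
        correct = sum(predict(point) == label for point, label in test)
        scores.append(correct / len(test))
    return sum(scores) / k

data = make_data()
baseline_acc = cross_val_accuracy(data, majority_baseline)
knn_acc = cross_val_accuracy(data, nearest_neighbour)
print(f"majority baseline:   {baseline_acc:.2f}")
print(f"1-nearest neighbour: {knn_acc:.2f}")
```

Reporting a simple baseline next to the model of interest helps avoid one common pitfall: an accuracy figure on its own says little until it is compared with what a trivial predictor achieves on the same folds.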

In this session, the following questions will be answered:

  • How to translate your business objective(s) to a data science task?
  • What are the most important data science tasks, and which machine learning algorithms and techniques exist to solve these tasks?
  • How to choose the appropriate algorithm based on important characteristics of the available data and expected model requirements such as accuracy, interpretability, scalability, etc.?
  • How to train and evaluate the resulting models, in order to arrive at the best possible performance?

Would you like to attend this session? Check out the seminar details in our agenda.