A Novel Toolset of Complex Machine Learning Solutions for Segmented and Fully Distributed Computing Environments
The EIT Digital Doctoral School offers an Industrial Doctorate position in the field of machine learning and distributed computing. The doctoral candidate will seek to develop a novel toolset of complex machine learning solutions for segmented and fully distributed computing environments. The work will be carried out in Budapest with the support of the EIT Digital partners Magyar Telekom Nyrt. and the Eötvös Loránd University (ELTE).
The thesis aims at developing a framework integrating novel data mining and machine learning methods focusing on fully distributed solutions for segmented computing environments with special interest in distributed real-time processing, analysis and management of data. The developed toolset will be pilot tested and evaluated in various use-case scenarios (for example, precision farming) including possible utilization of models from the area of edge computing with particular focus on privacy issues and trust management.
Processing and analyzing complex, sensitive, or region-specific data protected by regulation is challenging on conventional cloud computing platforms, where all the data are uploaded to the cloud and some kind of "global model" is built there. First, the data cannot be moved from where they were recorded, so they have to be processed locally. Second, anonymizing a complex data structure without losing information or introducing noise into the data is time-consuming and domain-dependent. Finally, one large-scale "global" model might not fit all the "local" data well, nor capture important, locally relevant patterns. In such scenarios, it would be desirable to build "local models" that are more specific to the relations and patterns present in the local data and that complement a global model comprising generic patterns valid for all the segmented local environments. The globally valid, domain-specific patterns could also be used when building the local models. A framework suitable for this scenario should therefore take into account state-of-the-art solutions for privacy preservation, real-time data processing, and data analytics.
The intended solution is based on integrating novel meta-learning approaches with fully distributed and privacy-preserving machine learning techniques. In meta-learning, data are represented by so-called meta-features that capture the main characteristics of the data. Together with local models, these meta-features might be used to build a global model while preserving privacy to some extent, since only information about the data (the meta-features) and the learnt local models are used to build the global model, not the data themselves. Local models might be learnt in segmented or segregated clouds, while the global model is built on a central cloud. Approaches and methods developed in the area of edge computing should be considered when building such a computational framework. The proposed hybrid approach, lying on the boundary of meta-learning, fully distributed and privacy-preserving machine learning, and edge computing, is innovative, and its development will yield a deployable solution for different domains such as precision farming or smart cities.
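The split between local and global models described above can be illustrated with a minimal sketch. It is not the project's method: the meta-features chosen here (sample count, mean, variance), the one-variable linear local model, and the sample-weighted averaging rule are all assumptions made for illustration, and every function name is hypothetical. The point it shows is only that each site shares a summary and a fitted model with the central side, never its raw records.

```python
import statistics


def fit_local_model(xs, ys):
    """Fit a least-squares line y = slope * x + intercept on one site's data."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extract_meta_features(xs):
    """Describe the local data with a few summary statistics (assumed choice)."""
    return {
        "n": len(xs),
        "mean_x": statistics.fmean(xs),
        "var_x": statistics.pvariance(xs),
    }


def summarize_site(xs, ys):
    """Everything a site shares: meta-features and model, no raw data."""
    return {"meta": extract_meta_features(xs), "model": fit_local_model(xs, ys)}


def build_global_model(summaries):
    """Average the local parameters, weighted by each site's sample count.

    This aggregation rule is an assumption; it stands in for whatever
    meta-learning procedure the framework would actually use.
    """
    total = sum(s["meta"]["n"] for s in summaries)
    slope = sum(s["meta"]["n"] * s["model"][0] for s in summaries) / total
    intercept = sum(s["meta"]["n"] * s["model"][1] for s in summaries) / total
    return slope, intercept


if __name__ == "__main__":
    # Two hypothetical sites whose raw measurements stay local.
    site_a = ([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
    site_b = ([3.0, 4.0, 5.0, 6.0], [7.0, 9.0, 11.0, 13.0])
    shared = [summarize_site(*site_a), summarize_site(*site_b)]
    print(build_global_model(shared))
```

Sharing only summaries and model parameters limits exposure of the raw data, but it is not by itself a privacy guarantee, which is consistent with the text's wording that privacy is preserved "to some extent".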
The work is planned on three interconnected layers. The first layer, which is the main focus, covers the implementation of relevant state-of-the-art meta-learning, fully distributed, and privacy-preserving machine learning techniques. The second layer supports the integration of the implemented techniques: the appropriateness of various integration models will be investigated in order to develop an optimal and robust framework. On the third layer, the toolset of implemented techniques and the developed framework will be continuously tested and evaluated on various use-case scenarios provided by our industrial partners, and refined and optimized further where the results call for it. The proposed toolset will be developed in an agile fashion so that it can react to the newest trends in the area, since the current landscape of relevant techniques, applications, and frameworks is changing rapidly.
The goal of the PhD is the development of a novel Machine Learning and Data Mining Toolset. This integrated framework of solutions will consist of fully distributed, privacy-preserving, meta machine learning techniques for segmented cloud infrastructures deployable to real-world industrial software environments. Pilots will be carried out in the software environment of the industrial partner.
The Toolset will be adaptable to various industrial software environments and to use-case applications in several domains (e.g. autonomous vehicles, 5G networks) where fully distributed and privacy-preserving solutions are required. The potential of this generic applicability will also be explored and demonstrated.
The position is based in the Doctoral School Training centre in Budapest where a strong ecosystem for Digital Infrastructure exists.
- Industrial partner: Magyar Telekom Nyrt.
- Academic/research partner: Eötvös Loránd University (ELTE)
- DTC location: Budapest
- Number of available PhD positions: 1
- Duration: 4 years
- This PhD will be funded by EIT Digital, Ericsson, and the Eötvös Loránd University (ELTE).
The person applying for this position should already be enrolled in the Doctoral School of ELTE and should have started their PhD studies in 2017.
Those interested in applying should send a letter of interest to Zoltán Istenes (email@example.com), EIT Digital – Budapest Doctoral Training Centre Lead.
Please apply before October 15, 2017.