2020
Hardware accelerator for executing a computation task
D. Diamantopoulos, F. Scheidegger, C. Malossi, C. Hagleitner, C. Bekas
US11175957B1
Abstract
The present disclosure relates to a hardware accelerator for executing a computation task composed of a set of operations. The hardware accelerator comprises a controller and a set of computation units. Each computation unit is configured to receive the input data of an operation and to perform that operation, the input data being represented with a distinct bit length associated with each computation unit. The controller is configured to receive input data represented with a certain bit length and to select one of the computation units that can deliver a valid result and that is associated with a bit length smaller than or equal to that certain bit length.
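A minimal sketch of the selection policy this abstract describes. The ComputationUnit container, the validity check, and the smallest-unit-first preference are illustrative assumptions, not the patented hardware design:

```python
# Hypothetical software model of the controller's unit-selection logic.
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class ComputationUnit:
    bit_length: int                       # precision this unit computes at
    run: Callable[[float, float], float]  # the operation, e.g. multiply

def select_unit(units: List[ComputationUnit],
                input_bit_length: int,
                delivers_valid_result: Callable[[ComputationUnit], bool]
                ) -> Optional[ComputationUnit]:
    """Pick a unit whose bit length is <= the input's bit length and that
    can still deliver a valid result; preferring the smallest such unit
    (e.g. to save energy) is one plausible policy, assumed here."""
    candidates = [u for u in units if u.bit_length <= input_bit_length]
    for unit in sorted(candidates, key=lambda u: u.bit_length):
        if delivers_valid_result(unit):
            return unit
    return None

units = [ComputationUnit(8, lambda a, b: a * b),
         ComputationUnit(16, lambda a, b: a * b),
         ComputationUnit(32, lambda a, b: a * b)]
chosen = select_unit(units, 16, lambda u: u.bit_length >= 16)  # toy validity rule
print(chosen.bit_length if chosen else "no valid unit")
```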
Using meta-learning to optimize automatic selection of machine learning pipelines
D. Wang, C. Gan, G. Bramble, L. Amini, H. Samulowitz, K. Kate, B. Chen, M. Wistuba, A. Evfimievski, I. Katsis, Y. Li, C. Malossi, A. Bartezzaghi, B. Kawas, S. Gurajada, L. Popa, T. Pedapati, A. Gray
US20220051049A1
Abstract
A computer automatically selects a machine learning model pipeline using a meta-learning machine learning model. The computer receives ground truth data and pipeline preference metadata. The computer determines a group of pipelines appropriate for the ground truth data, and each of the pipelines includes an algorithm. The pipelines may include data preprocessing routines. The computer generates hyperparameter sets for the pipelines. The computer applies preprocessing routines to the ground truth data to generate a group of preprocessed sets of said ground truth data and ranks hyperparameter set performance for each pipeline to establish a preferred set of hyperparameters for each pipeline. The computer selects favored data features and applies each of the pipelines, with associated sets of preferred hyperparameters, to score the favored data features of the preprocessed ground truth data. The computer ranks pipeline performance and selects a candidate pipeline according to the ranking.
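A minimal sketch of the selection flow the abstract describes, using scikit-learn as a stand-in: hyperparameter sets are ranked per pipeline by cross-validated score, then the pipelines themselves are ranked and a candidate is selected. The candidate pipelines and grids below are illustrative, not the patented meta-learning model:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, random_state=0)  # stand-in ground truth

candidates = {
    "logreg": (Pipeline([("scale", StandardScaler()),        # preprocessing routine
                         ("clf", LogisticRegression(max_iter=1000))]),
               {"clf__C": [0.1, 1.0, 10.0]}),
    "forest": (Pipeline([("clf", RandomForestClassifier(random_state=0))]),
               {"clf__n_estimators": [50, 100]}),
}

ranked = []
for name, (pipe, grid) in candidates.items():
    search = GridSearchCV(pipe, grid, cv=3)  # ranks hyperparameter sets per pipeline
    search.fit(X, y)
    ranked.append((search.best_score_, name, search.best_params_))

ranked.sort(reverse=True)                    # rank pipeline performance
best_score, best_name, best_params = ranked[0]
print(f"candidate pipeline: {best_name} {best_params} (cv score {best_score:.3f})")
```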
2018
Automatic determination of cognitive models for deployment at computerized devices having various hardware constraints
F. M. Scheidegger, R. Istrate, G. Mariani, K. Bekas, A. C. I. Malossi
US20200193266A1
Abstract
Determining cognitive models to be deployed at auxiliary devices may include maintaining relations, e.g., in a database. The relations map hardware characteristics of auxiliary devices and example datasets to cognitive models. Cognitive models are determined for the auxiliary devices, e.g., for each of them, based on said relations. An input dataset is accessed, which comprises data of interest, e.g., collected at a core computing system (CCS), and hardware characteristics of each of the auxiliary devices. An auxiliary cognitive model is determined by running a core cognitive model, which has been trained to learn at least part of said relations, on the accessed input dataset. Parameters of the determined auxiliary model can be communicated to each auxiliary device, which then deploys the model. The method may be implemented in a network having an edge computing architecture.
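A minimal sketch of the relation-lookup idea, assuming the relations reduce to a table from a single hardware characteristic (a memory budget) to candidate models; the table entries, model names, and most-accurate-that-fits rule are all hypothetical:

```python
RELATIONS = [
    # (max model size in MB, model identifier, reference accuracy) - illustrative
    (5,   "mobilenet_v2_0.35", 0.60),
    (20,  "mobilenet_v2_1.0",  0.72),
    (100, "resnet50",          0.76),
]

def model_for_device(memory_budget_mb: float) -> str:
    """Return the most accurate model whose size fits the device budget."""
    fitting = [(acc, name) for size, name, acc in RELATIONS
               if size <= memory_budget_mb]
    if not fitting:
        raise ValueError("no cognitive model fits this device")
    return max(fitting)[1]

devices = {"camera-edge-01": 8, "gateway-02": 64}  # hypothetical auxiliary devices
for device, budget in devices.items():
    print(device, "->", model_for_device(budget))
```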
Creating optimized machine-learning models
G. Thomas, A. C. I. Malossi, T. Pedapati, G. Venkataraman, R. Istrate, M. Wistuba, F. M. Scheidegger, C. Xue, R. Yan, H. C. Samulowitz, B. Herta, D. Saha, H. Strobelt
US20200184380A1
Abstract
A machine-learning model generation method, system, and computer program product that decide, via a first algorithm, which machine-learning algorithm is best for customer data; invoke that machine-learning algorithm to train a neural network model with the customer data; analyze the accuracy of the neural network model produced by the training; and improve the accuracy by iteratively repeating the training of the neural network model until a customer-defined constraint is met, as determined by the first algorithm.
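A minimal sketch of the iterate-until-constraint loop; train_one_round, evaluate, and the accuracy target are hypothetical placeholders for the selected algorithm's training and validation steps:

```python
def optimize_model(train_one_round, evaluate,
                   target_accuracy=0.95, max_rounds=50):
    """Repeat training rounds until the customer-defined accuracy
    constraint is met, or give up after max_rounds."""
    model, accuracy = None, 0.0
    for _ in range(max_rounds):
        model = train_one_round(model)   # continue training the network
        accuracy = evaluate(model)       # analyze the trained model's accuracy
        if accuracy >= target_accuracy:  # customer-defined constraint
            break
    return model, accuracy

# Toy demo: "training" just nudges a number toward the target.
model, acc = optimize_model(train_one_round=lambda m: (m or 0.0) + 0.1,
                            evaluate=lambda m: m,
                            target_accuracy=0.95)
print(model, acc)
```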
2017
Higher accuracy of non-volatile memory-based vector multiplication
K. Bekas, A. Curioni, E. S. Eleftheriou, M. Le Gallo-Bourdeau, A. C. I. Malossi, A. Sebastian
US10614150B2
Abstract
A multiplication device for performing a matrix-vector multiplication may be provided. The multiplication device comprises a memristive crossbar array with a plurality of memristive devices. It further comprises a decomposition unit adapted for decomposing a matrix into a partial sum of multiple sub-matrices and a vector into a sum of multiple sub-vectors; a programming unit adapted for programming the memristive devices with values representing elements of the sub-matrices, such that each memristive device corresponds to one of those elements; an applying unit adapted for applying the elements of one sub-vector as input values to the input lines of the crossbar array, yielding partial results at its output lines; and a summing unit adapted for scaling and summing the partial results to build the product of the matrix and the vector.
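A minimal numpy sketch of the decomposition idea in the abstract: an integer matrix and vector are split into low-precision base-b "digits", each digit-by-digit product stands in for one analog crossbar pass, and the scaled partial results are summed to recover the full-precision product. The base and digit count are illustrative choices, not the patented parameters:

```python
import numpy as np

def digits(a, base, n):
    """Decompose integer array a as sum_k base**k * d[k], with 0 <= d[k] < base."""
    out = []
    for _ in range(n):
        out.append(a % base)
        a = a // base
    return out

rng = np.random.default_rng(0)
W = rng.integers(0, 256, size=(4, 4))  # matrix with 8-bit entries
x = rng.integers(0, 256, size=4)       # vector with 8-bit entries
base, n = 16, 2                        # two 4-bit slices per operand

result = np.zeros(4, dtype=np.int64)
for k, Wk in enumerate(digits(W, base, n)):      # programmed sub-matrices
    for j, xj in enumerate(digits(x, base, n)):  # applied sub-vectors
        partial = Wk @ xj                        # one low-precision crossbar pass
        result += (base ** (k + j)) * partial    # scale and sum partial results

assert np.array_equal(result, W @ x)             # exact full-precision product
print(result)
```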
2016
Logarithm and Power (Exponentiation) Computations Using Modern Computer Architectures
K. Bekas, A. Curioni, Y. G. Ineichen, A. C. I. Malossi
US Patent App. 15/138,846
Abstract
Embodiments of the present invention may provide the capability to evaluate logarithm and power (exponentiation) functions using either hardware-specific instructions or a hardware-specific implementation with reduced memory requirements. An input comprising a floating-point representation of a real number may be received and its mantissa and exponent extracted. A function of the logarithm of the mantissa may be approximated by a polynomial in the mantissa, and the approximation combined with the exponent to calculate a value comprising the logarithm of the real number. Likewise, an input comprising a floating-point representation of a real number and a representation of a second number may be received, and an approximation of the real number raised to the power of the second number may be generated.
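A minimal sketch of the scheme the abstract describes: extract mantissa and exponent, approximate the logarithm of the mantissa with a polynomial, and recombine with the exponent; a power x**y then follows as 2**(y * log2(x)). The degree-5 polynomial fitted here is an illustrative approximation, not the patented coefficient set:

```python
import math
import numpy as np

# Fit log2(m) on m in [0.5, 1), the mantissa range math.frexp returns.
_m = np.linspace(0.5, 1.0, 1000)
_coeffs = np.polyfit(_m, np.log2(_m), 5)

def approx_log2(x: float) -> float:
    """Approximate log2(x) for x > 0 via mantissa/exponent extraction."""
    m, e = math.frexp(x)               # x = m * 2**e with 0.5 <= m < 1
    return e + np.polyval(_coeffs, m)  # log2(x) = e + log2(m)

def approx_pow(x: float, y: float) -> float:
    """Approximate x**y for x > 0 as 2**(y * log2(x))."""
    return 2.0 ** (y * approx_log2(x))

print(approx_log2(1234.5), math.log2(1234.5))
print(approx_pow(2.7, 3.1), 2.7 ** 3.1)
```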
2015
Fast, energy-efficient exponential computations in SIMD architectures
K. Bekas, A. Curioni, Y. Ineichen, A. C. I. Malossi
US Patent App. 14/745,499
Abstract
In one embodiment, a computer-implemented method includes receiving as input a value of a variable x and receiving as input a degree n of a polynomial function being used to evaluate the exponential function e^x. A first expression, A*(x - ln(2)*K_n(x_f)) + B, is evaluated, by one or more computer processors in a single instruction multiple data (SIMD) architecture, as an integer and is read as a double. In the first expression, K_n(x_f) is a polynomial function of degree n, x_f is the fractional part of x/ln(2), A = 2^52/ln(2), and B = 1023*2^52. The result of reading the first expression as a double is returned as the value of the exponential function at x.
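A minimal scalar (rather than SIMD) sketch of this trick: A*(x - ln(2)*K_n(x_f)) + B is truncated to a 64-bit integer whose bit pattern, reinterpreted as an IEEE-754 double, directly yields e^x. For this to work, K_n must approximate 1 + x_f - 2**x_f on [0, 1); the degree-3 polynomial fitted below is an illustrative choice of coefficients, not the patented ones:

```python
import math
import struct
import numpy as np

A = 2.0 ** 52 / math.log(2.0)  # scales log2(e**x) = x/ln(2) into the exponent bits
B = 1023 * 2 ** 52             # IEEE-754 double exponent bias, shifted into place

# Fit the mantissa correction K_n(x_f) ~= 1 + x_f - 2**x_f on [0, 1).
_t = np.linspace(0.0, 1.0, 1000)
_k = np.polyfit(_t, 1.0 + _t - 2.0 ** _t, 3)

def fast_exp(x: float) -> float:
    x_f = (x / math.log(2.0)) % 1.0  # fractional part of x/ln(2)
    bits = int(A * (x - math.log(2.0) * np.polyval(_k, x_f)) + B)
    # Evaluate as an integer, then read that integer's bits as a double.
    return struct.unpack("<d", struct.pack("<q", bits))[0]

for x in (-2.0, 0.5, 3.7):
    print(fast_exp(x), math.exp(x))
```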