
COR-GAN: Correlation-Capturing Convolutional Neural Networks for Generating Synthetic Healthcare Records


Deep learning models have demonstrated high-quality performance in areas such as image classification and speech processing. However, creating a deep learning model using electronic health record (EHR) data requires addressing particular privacy challenges that are unique to researchers in this domain. This focuses attention on generating realistic synthetic data while ensuring privacy. In this paper, we propose a novel framework called correlation-capturing Generative Adversarial Network (corGAN) to generate synthetic healthcare records. In corGAN we utilize Convolutional Neural Networks to capture the correlations between adjacent medical features in the data representation space by combining Convolutional Generative Adversarial Networks and Convolutional Autoencoders. To demonstrate the model fidelity, we show that corGAN generates synthetic data with performance similar to that of real data in various Machine Learning settings such as classification and prediction. We also give a privacy assessment and report on statistical analysis regarding realistic characteristics of the synthetic data. The software of this work is open-source and is available at: this https URL.


2019-09 Andrew Ng at Amazon re:MARS 2019

In eras of technological disruption, leadership matters.

Andrew Ng speaks about the progress of AI, how to accelerate AI adoption, and what’s around the corner for AI at Amazon re:MARS 2019 in Las Vegas, Nevada.


*****, AI for Good, AI FATE (fairness accuracy transparency ethics), AI Forecast, AI Technology advance, AI Training, Business, Data, PersonOfInterest

2019-08 Waymo is going to share its self-driving data—but it’s still not enough

Waymo says it will share some of the data it’s gathered from its vehicles for free so other researchers working on autonomous driving can use it. Waymo isn’t the first to do this: Lyft, Argo AI, and other firms have already open-sourced some data sets. But Waymo’s move is notable because its vehicles have covered millions of miles on roads already.


2017-09 Dealing With Imbalanced Datasets

Dealing with imbalanced datasets is an everyday problem. SMOTE (Synthetic Minority Oversampling TEchnique) and its variants address this problem by oversampling the minority class, and have recently become a very popular way to improve model performance.
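
As a rough illustration of the oversampling idea, here is a minimal sketch of the basic SMOTE step in plain Python. It assumes numeric features and a small in-memory list of minority-class points; the function name `smote`, its parameters, and the sample data are hypothetical and chosen only for this example, not taken from the linked article or from any library.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between minority-class neighbours.

    Hypothetical helper for illustration; not a library API.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    k = min(k, len(minority) - 1)      # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # The k nearest minority neighbours of the base point, excluding itself.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Made-up minority-class points with two features each.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_synthetic=6)
```

Each synthetic point lies on a line segment between two existing minority samples, so the minority class grows without exact duplicates. In practice a maintained implementation is usually preferable to a hand-written one like this.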


2019-07 Building Better Deep Learning Requires New Approaches Not Just Bigger Data

In its rush to solve all the world’s problems through deep learning, Silicon Valley is increasingly embracing the idea of AI as a universal solver that can be rapidly adapted to any problem in any domain simply by taking a stock algorithm and feeding it relevant training data. The problem with this assumption is that today’s deep learning systems are little more than correlative pattern extractors that search large datasets for basic patterns and encode them into software. While impressive compared to the standards of previous eras, these systems are still extraordinarily limited, capable only of identifying simplistic correlations rather than actually semantically understanding their problem domain. In turn, the hand-coded era’s focus on domain expertise, ethnographic codification and deeply understanding a problem domain has given way to parachute programming in which deep learning specialists take an off-the-shelf algorithm, shove in a pile of training data, dump out the resulting model and move on to the next problem. Truly advancing the state of deep learning and the way in which companies make use of it will require a return to the previous era’s focus on understanding problems rather than merely churning canned models off assembly lines.


2019-05-21 Dealing with the Lack of Data in Machine Learning

In many projects I carried out, companies, despite having fantastic AI business ideas, display a tendency to slowly become frustrated when they realize that they do not have enough data… However, solutions do exist! The purpose of this article is to briefly introduce you to some of them (the ones that are proven effective in…