a OCT: optical coherence tomography.
b CNN: convolutional neural network.
c MRI: magnetic resonance imaging.
d WSI: whole slide image.
e CAE: convolutional autoencoder.
f ResNet: residual networks.
g CT: computed tomography.
h DTI: diffusion tensor imaging.
i mCNN: multicolumn convolutional neural network.
j FCNN: fully convolutional neural network.
k SAE: stacked autoencoder.
l CAD: coronary artery disease.
m SWE: shear wave elastography.
n MIL: multiple instance learning.
o FFNN: feedforward neural network.
p MR: magnetic resonance.
q GAN: generative adversarial network.
r SMILES: simplified molecular input line-entry system.
s RNN: recurrent neural network.
t GRU: gated recurrent unit.
u LSTM: long short-term memory.
v AE: autoencoder.
w AAE: adversarial autoencoder.
x NLP: natural language processing.
y BLSTM: bidirectional long short-term memory.
In these studies, researchers applied or developed deep learning architectures mainly for the following purposes: image analysis, especially for diagnostic purposes, including the classification or prediction of diseases or survival, and the detection, localization, or segmentation of certain areas or abnormalities. The latter 3 tasks, which all aim to identify the location of an object of interest, differ in that detection involves a single reference point, localization involves an area identified through a bounding box, saliency map, or heatmap, and segmentation involves a precise area with clear outlines identified through pixel-wise analysis. Meanwhile, in some studies, models for image analysis unrelated to diagnosis were proposed, such as classifying or segmenting cells in microscopic images and tracking moving animals in videos through pose estimation. Another major objective involved image processing for reconstructing or registering medical images. This included enhancing low-resolution images to high resolution, reconstructing images with different modalities or synthesized targets, reducing artifacts, dealiasing, and aligning medical images.
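The distinction among detection, localization, and segmentation can be illustrated with a minimal sketch. It assumes a toy binary mask (not data from any reviewed study) and uses the centroid as the detection reference point, which is only one possible choice; all function names are hypothetical.

```python
from typing import List, Tuple

Mask = List[List[int]]  # 1 marks a pixel of the object, 0 marks background


def segmentation_pixels(mask: Mask) -> List[Tuple[int, int]]:
    """Segmentation: every (row, col) pixel that belongs to the object."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def localization_box(mask: Mask) -> Tuple[int, int, int, int]:
    """Localization: the tightest bounding box as (top, left, bottom, right)."""
    pixels = segmentation_pixels(mask)
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)


def detection_point(mask: Mask) -> Tuple[float, float]:
    """Detection: a single reference point, here the centroid (an assumption)."""
    pixels = segmentation_pixels(mask)
    n = len(pixels)
    return sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n


# Toy mask used only for illustration.
toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]

print(detection_point(toy_mask))           # one point
print(localization_box(toy_mask))          # one box
print(len(segmentation_pixels(toy_mask)))  # every object pixel
```

The three outputs carry increasing spatial detail: a point, a box, and the full set of object pixels.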
Meanwhile, several researchers used deep learning architectures to analyze molecules, proteins, and genomes for various purposes. These included drug design or discovery, specifically for generating novel molecular structures through sequence analysis and for predicting binding affinities through image analysis of complexes; understanding protein structure through image analysis of contact matrix; and predicting phenotypes, cancer survival, drug synergies, and genomic variant effects from genes or genomes. Finally, in some studies, deep learning was applied to the diagnostic classification of sequential data, including electrocardiogram or polysomnogram signals and electronic health records. In summary, in the reviewed literature, we identified a predominant focus on applying or developing deep learning models for image analysis regarding localization or diagnosis and image processing, with a few studies focusing on protein or genome analysis.
Regarding the main architectures, most models were CNNs based on ≥1 CNN architecture, such as a fully convolutional neural network (FCNN) and its variants, including U-net; residual neural network (ResNet) and its variants; GoogLeNet (Inception v1) or Inception; VGGNet and its variants; and other architectures. Meanwhile, a few researchers based their models on feedforward neural networks that were not CNNs, including autoencoders (AEs) such as convolutional AE and stacked AE. Others adapted RNNs, including (bidirectional) long short-term memory and gated recurrent unit. Furthermore, models that combined RNNs or AEs with CNNs were also proposed.
Content analysis of the reviewed literature showed that different deep learning architectures were used for different research tasks. Models for classification or prediction tasks using images were predominantly CNN based, with most being ResNet and GoogLeNet or Inception. ResNet with shortcut connections [ 129 ] and GoogLeNet or Inception with 1×1 convolutions, factorized convolutions, and regularizations [ 130 , 131 ] allow networks of increased depth and width by solving problems such as vanishing gradients and computational costs. These mostly analyzed medical images from magnetic resonance imaging or computed tomography, with cancer-related images often used as input data for diagnostic classification, in addition to image-like representations of protein complexes. Meanwhile, when applying these tasks to data other than images, such as genomic or gene expression profiles and protein sequence matrices, researchers used feedforward neural networks, including AEs, that enabled semi- or unsupervised learning and dimensionality reduction.
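The shortcut connection mentioned above can be sketched in a few lines. This is a simplified, hypothetical block with two small linear layers, not the ResNet implementation used in any reviewed study; it only shows that the input is added back to the output of the learned layers.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(weights: Matrix, v: Vector) -> Vector:
    """Multiply a weight matrix by a vector (no bias, for brevity)."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_block(v: Vector, w1: Matrix, w2: Matrix) -> Vector:
    """Compute relu(F(v) + v), where F is linear -> ReLU -> linear."""
    residual = linear(w2, relu(linear(w1, v)))
    # The shortcut connection adds the unchanged input to the learned residual.
    return relu([f + x for f, x in zip(residual, v)])


zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [1.5, 2.0]
# With all-zero weights the learned path contributes nothing,
# so the (nonnegative) input passes through the block unchanged.
print(residual_block(x, zeros, zeros))
```

Because the identity path bypasses the weighted layers, the signal (and, during training, the gradient) has a direct route through the block, which is the property that makes very deep stacks trainable.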
Image analysis for segmentation and image processing was achieved through CNN-based architectures as well, with most of them being FCNNs, especially U-net. FCNNs produce an input-sized pixel-wise prediction by replacing the last fully connected layers with convolution layers, making them advantageous for the abovementioned tasks [ 132 ], and U-net improves performance on these tasks through long skip connections that concatenate feature maps from the encoder path to the decoder path [ 133 ]. In particular, for medical image processing tasks, a few researchers combined FCNNs (U-net) with other CNNs by adopting the generative adversarial network structure, which generates new instances that mimic the real data through an adversarial process between the generator and discriminator [ 134 ]. We found that images of the brain were often used as input data for these studies.
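A minimal sketch of the two ideas described above: a 1×1 convolution in place of a fully connected layer, which yields one prediction per pixel, and a long skip connection that concatenates encoder and decoder feature maps along the channel axis. The feature maps, weights, and function names are toy assumptions for illustration, not taken from any reviewed model.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [row][col][channel]


def conv1x1(features: FeatureMap, weights: List[List[float]]) -> FeatureMap:
    """Apply the same (out_channels x in_channels) weights at every pixel."""
    return [
        [[sum(w * x for w, x in zip(w_row, pixel)) for w_row in weights] for pixel in row]
        for row in features
    ]


def skip_concat(encoder: FeatureMap, decoder: FeatureMap) -> FeatureMap:
    """Concatenate encoder and decoder channels pixel by pixel (long skip)."""
    return [
        [e_px + d_px for e_px, d_px in zip(e_row, d_row)]
        for e_row, d_row in zip(encoder, decoder)
    ]


def argmax_labels(scores: FeatureMap) -> List[List[int]]:
    """Pick the highest-scoring class at each pixel."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in scores]


encoder = [[[1.0], [0.0]], [[0.0], [1.0]]]  # 2x2 map, 1 channel (toy values)
decoder = [[[0.5], [0.5]], [[0.5], [0.5]]]  # 2x2 map, 1 channel (toy values)
merged = skip_concat(encoder, decoder)      # 2x2 map, 2 channels
weights = [[0.0, 1.0], [1.0, 0.0]]          # class 0 reads decoder, class 1 reads encoder
labels = argmax_labels(conv1x1(merged, weights))
print(labels)  # one class label per input pixel
```

The label grid has the same height and width as the input maps, which is what makes this family of architectures suitable for segmentation and image-to-image tasks.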
On the other hand, RNNs were applied to sequence analysis of the string representation of molecules (simplified molecular input line-entry system) and pattern analysis of sequential data such as signals. A few of these models, especially those generating novel molecular structures, combined RNNs with CNNs by adopting generative adversarial networks, including adversarial AE. In summary, the findings showed that the current deep learning models were predominantly CNN based, that most of them focused on analyzing medical image data, and that different architectures were preferred for specific tasks.
Among these studies, Table 3 shows, in detail, the objectives and the proposed methods of the 35 studies with novel model development.
Table 3. Content analysis of the top 35 records in the development category.
Number | Development objectives | Methods (proposed model) |
D1 | Segment brain anatomical structures in 3D MRI | Voxelwise Residual Network: trained through residual learning of volumetric feature representation and integrated with contextual information of different modalities and levels |
D2 | Estimate poses to track body parts in various animal behaviors | DeeperCut’s subset DeepLabCut: network fine-tuned on labeled body parts, with deconvolutional layers producing spatial probability densities to predict locations |
D3 | Predict isocitrate dehydrogenase 1 mutation in low-grade glioma with MRI radiomics analysis | Deep learning–based radiomics: segment tumor regions and directly extract radiomics image features from the last convolutional layer, which is encoded for feature selection and prediction |
D4 | Predict protein-ligand binding affinities represented by 3D descriptors | KDEEP: 3D network to predict binding affinity using voxel representation of protein-ligand complex with assigned property according to its atom type |
D5 | Predict phenotype from genotype through the biological hierarchy of cellular subsystems | DCell: visible neural network with structure following cellular subsystem hierarchy to predict cell growth phenotype and genetic interaction from genotype |
D6 | Classify and localize thoracic diseases in chest radiographs | DenseNet-based CheXNeXt: networks trained for each pathology to predict its presence and ensemble and localize indicative parts using class activation mappings |
D7 | Multi-classification of breast cancer from histopathological images | CSDCNN: trained through end-to-end learning of hierarchical feature representation and optimized feature space distance between breast cancer classes |
D8 | Interactive segmentation of 2D and 3D medical images fine-tuned on a specific image | Bounding box and image-specific fine-tuning–based segmentation: trained for interactive image segmentation using bounding box and fine-tuned for specific image with or without scribble and weighted loss function |
D9 | Facial image analysis for identifying phenotypes of genetic syndromes | DeepGestalt: preprocessed for face detection and multiple regions and extracts phenotype to predict syndromes per region and aggregate probabilities for classification |
D10 | Predict cancer outcomes with genomic profiles through survival models optimization | SurvivalNet: deep survival model with high-dimensional genomic input and Bayesian hyperparameter optimization, interpreted using risk backpropagation |
D11 | Predict synergy effect of novel drug combinations for cancer treatment | DeepSynergy: predicts drug synergy value using cancer cell line gene expressions and chemical descriptors, which are normalized and combined through conic layers |
D12 | Classify liver fibrosis stages in chronic hepatitis B using radiomics of SWE | DLRE: predict the probability of liver fibrosis stages with quantitative radiomics approach through automatic feature extraction from SWE images |
D13 | Predict protein residue contact map at pixel level with protein features | RaptorX-Contact: combined networks to learn contact occurrence patterns from sequential and pairwise protein features to predict contacts simultaneously at pixel level |
D14 | Segment liver and tumor in abdominal CT scans | Hybrid Densely connected U-net: 2D and 3D networks to extract intra- and interslice features with volumetric contexts, optimized through hybrid feature fusion layer |
D15 | Reconstruct compressed sensing MRI to dealiased image | DAGAN: conditional GAN stabilized by refinement learning, with the content loss combined adversarial loss incorporating frequency domain data |
D16 | Reconstruct sparse localization microscopy to superresolution image | Artificial Neural Network Accelerated–Photoactivated Localization Microscopy: trained with superresolution PALM as the target, compares reconstructed and target with loss functions containing conditional GAN |
D17 | Generate novel chemical compound design with desired properties | Reinforcement Learning for Structural Evolution: generate chemically feasible molecule as strings and predict its property, which is integrated with reinforcement learning to bias the design |
D18 | Reduce metal artifacts in reconstructed x-ray CT images | CNN-based Metal Artifact Reduction: trained on images processed by other Metal Artifact Reduction methods and generates prior images through tissue processing and replaces metal-affected projections |
D19 | Predict species to identify anthrax spores in single cell holographic images | HoloConvNet: trained with raw holographic images to directly recognize interspecies difference through representation learning using error backpropagation |
D20 | Classify and detect malignant pulmonary nodules in chest radiographs | Deep learning–based automatic detection: predict the probability of nodules per radiograph for classification and detect nodule location per nodule from activation value |
D21 | Predict tissue-specific gene expression and genomic variant effects on the expression | ExPecto: predict regulatory features from sequences and transform to spatial features and use linear models to predict tissue-specific expression and variant effects |
D22 | Reconstruct MRF to obtain tissue parameter maps | Deep reconstruction network: trained with a sparse dictionary that maps magnitude image to quantitative tissue parameter values for MRF reconstruction |
D23 | Generate high-resolution Hi-C interaction matrix of chromosomes from a low-resolution matrix | HiCPlus: predict high-resolution matrix through mapping regional interaction features of low-resolution to high-resolution submatrices using neighboring regions |
D24 | Estimate poses to track body parts of freely moving animals | LEAP: videos preprocessed for egocentric alignment and body parts labeled using GUI and predicts each location by confidence maps with probability distributions |
D25 | Jointly segment optic disc and cup in fundus images for glaucoma screening | M-Net: multi-scale network for generating multi-label segmentation prediction maps of disc and cup regions using polar transformation |
D26 | Reconstruct limited-view PAT to high-resolution 3D images | Deep gradient descent: learned iterative image reconstruction, incorporated with gradient information of the data fit separately computed from training |
D27 | Predict classifications of and localize knee injuries from MRI | MRNet: networks trained for each diagnosis according to a series to predict its presence and combine probabilities for classification using logistic regression |
D28 | Predict binding affinities between 3D structures of protein-ligand complexes | Pafnucy: structure-based prediction using 3D grid representation of molecular complexes with different orientations as having same atom types |
D29 | Classify electrocardiogram signals based on wavelet transform | Deep bidirectional LSTM network–based wavelet sequences: generate decomposed frequency subbands of electrocardiogram signal as sequences by wavelet-based layer and use as input for classification |
D30 | Generate novel small molecule structures with possible biological activity | Reinforced Adversarial Neural Computer: combined with GAN and reinforcement learning, generates sequences matching the key feature distributions in the training molecule data |
D31 | Detect and localize breast cancer metastasis in digitized lymph nodes slides | LYmph Node Assistant: predict the likelihood of tumor in tissue area and generate a heat map for slides identifying likely areas |
D32 | Transform low-resolution thick slice knee MRI to high-resolution thin slices | DeepResolve: trained to compute residual images, which are added to low-resolution images to generate their high-resolution images |
D33 | Reconstruct sparse-view CT to suppress artifact and preserve feature | Learned Experts’ Assessment–Based Reconstruction Network: iterative reconstruction using previous compressive sensing methods, with fields of expert-applied regularization terms learned iteration dependently |
D34 | Unsupervised affine and deformable aligning of medical images | Deep Learning Image Registration: multistage registration network and unsupervised training to predict transformation parameters using image similarity and create warped moving images |
D35 | Classify subcellular localization patterns of proteins in microscopy images | Localization Cellular Annotation Tool: predict localization per cell for image-based classification of multi-localizing proteins, combined with gamer annotations for transfer learning |
a MRI: magnetic resonance imaging.
b CSDCNN: class structure-based deep convolutional neural network.
c SWE: shear wave elastography.
d DLRE: deep learning radiomics of elastography.
e CT: computed tomography.
f DAGAN: Dealiasing Generative Adversarial Networks.
g GAN: generative adversarial network.
h PALM: photoactivated localization microscopy.
i CNN: convolutional neural network.
j MRF: magnetic resonance fingerprinting.
k LEAP: LEAP Estimates Animal Pose.
l GUI: graphical user interface.
m PAT: photoacoustic tomography.
n LSTM: long short-term memory.
In quite a few of the reviewed studies, the black box problem of deep learning was partly addressed, as researchers implemented various methods to improve model interpretability. To understand the prediction results of image analysis models, most used one of the following two techniques to visualize the important regions: (1) activation-based heatmaps [ 45 , 54 , 65 , 70 ], especially class activation maps [ 57 , 61 , 77 , 92 ], and saliency maps [ 59 ] and (2) occlusion testing [ 39 , 75 , 82 , 94 ]. For models analyzing data other than images, there were no generally accepted techniques for model interpretation, and researchers suggested some methods, including adopting an interpretable hierarchical structure such as the cellular subsystem [ 122 ] or anatomical division [ 125 ], using backpropagation [ 123 ], observing gate activations of cells in the neural network [ 114 ], or investigating how corrupted input data affect the prediction and how identical predictions are made for different inputs [ 93 ]. As such, various methods were found to be used to tackle this well-known limitation of deep learning.
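Occlusion testing, one of the two visualization techniques named above, can be sketched as follows. The scoring function here is a hypothetical stand-in for a trained model, and the patch size and fill value are assumptions; the sketch only shows the general procedure of masking one region at a time and recording the drop in the model output.

```python
from typing import Callable, List

Image = List[List[float]]


def occlusion_map(
    image: Image,
    score: Callable[[Image], float],
    patch: int = 1,
    fill: float = 0.0,
) -> List[List[float]]:
    """For each patch position, return how much the score drops when it is masked."""
    base = score(image)
    height, width = len(image), len(image[0])
    drops = []
    for top in range(0, height, patch):
        row_drops = []
        for left in range(0, width, patch):
            occluded = [row[:] for row in image]  # copy so the input stays intact
            for r in range(top, min(top + patch, height)):
                for c in range(left, min(left + patch, width)):
                    occluded[r][c] = fill
            row_drops.append(base - score(occluded))
        drops.append(row_drops)
    return drops


def toy_score(image: Image) -> float:
    """Hypothetical model: responds only to the top-left 2x2 corner."""
    return sum(image[r][c] for r in range(2) for c in range(2))


image = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
heat = occlusion_map(image, toy_score)
for row in heat:
    print(row)
```

Positions with a large drop are the regions the model relies on; positions with no drop do not influence the prediction.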
On average, each examined deep learning study with at least one PubMed-indexed citation (429/978, 43.9%) had 25.8 (SD 20.0) citations. These cited references comprised 9373 unique records that were cited 1.27 times on average (SD 2.16). Excluding the ones that were unindexed in the WoS Core Collection (755/9373, 8.06% of the unique records; 8618 remained), an average of 1.77 (SD 1.07) categories were assigned to a record. The top 10 WoS categories, which were assigned to the greatest number of total cited references, pertained to the following three major groups: (1) biomedicine (Radiology, Nuclear Medicine, and Medical Imaging: 2025/11,033, 18.35%; Biochemical Research Methods: 1118/11,033, 10.13%; Mathematical and Computational Biology: 1066/11,033, 9.66%; Biochemistry and Molecular Biology: 1043/11,033, 9.45%; Engineering, Biomedical: 981/11,033, 8.89%; Biotechnology and Applied Microbiology: 916/11,033, 8.3%; Neurosciences: 844/11,033, 7.65%), (2) computer science and engineering (Computer Science, Interdisciplinary Applications: 1041/11,033, 9.44%; Engineering, Electrical and Electronic: 645/11,033, 5.85%), and (3) Multidisciplinary Sciences (1411/11,033, 12.79%).
To understand the intellectual structure of how knowledge is transferred among different areas of study through citations, we visualized the citation network of WoS subject categories. In the directed citation network shown in Figure 5, the edges were directed clockwise with the source nodes as the WoS categories of the deep learning studies we examined and the target nodes as the WoS categories of the cited references from which knowledge was obtained. To enhance legibility, we filtered out categories with <100 weighted degrees, excluding self-loops, to form a network of 20 nodes (20/158, 12.7% of the total) and 59 edges (59/2380, 2.48% of the total). In the figure, the node color and size are proportional to the PageRank score (probability 0.85; ε=0.001; Figure 5A) and weighted outdegree (Figure 5B), and the edge size and color are proportional to the link strength. PageRank considers not only the quantity but also the quality of incoming edges, identifying important exporters for knowledge diffusion based on how often and by which fields a node is cited. On the other hand, the weighted outdegree measures outgoing edges and identifies major knowledge importers that frequently cite other fields.
Figure 5. Citation network of the Web of Science subject categories assigned to the reviewed publications and their cited references according to (A) PageRank and (B) weighted outdegree (number of nodes=20; number of edges=59).
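The two node measures used for this network can be sketched on a toy graph. The sketch below is a plain power-iteration PageRank with the damping probability (0.85) and stopping tolerance (ε=0.001) stated in the text; it is not the software used in the study, and the three category names and edge weights are made-up illustration values. Edge weights are assumed to be positive.

```python
from typing import Dict, Tuple

Edges = Dict[Tuple[str, str], float]  # (citing category, cited category) -> weight


def weighted_outdegree(edges: Edges) -> Dict[str, float]:
    """Sum of outgoing edge weights per node: how heavily a field cites others."""
    out: Dict[str, float] = {}
    for (source, target), weight in edges.items():
        out[source] = out.get(source, 0.0) + weight
        out.setdefault(target, 0.0)
    return out


def pagerank(edges: Edges, damping: float = 0.85, epsilon: float = 0.001) -> Dict[str, float]:
    """Weighted PageRank by power iteration; stops when the total change < epsilon."""
    out = weighted_outdegree(edges)
    nodes = sorted(out)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    while True:
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if out[node] == 0.0)
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for (source, target), weight in edges.items():
            new_rank[target] += damping * rank[source] * weight / out[source]
        change = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if change < epsilon:
            return rank


# Hypothetical toy network, not the study's data.
toy_edges: Edges = {
    ("Computational Biology", "Medical Imaging"): 3.0,
    ("Biomedical Engineering", "Medical Imaging"): 2.0,
    ("Medical Imaging", "Biomedical Engineering"): 1.0,
}

ranks = pagerank(toy_edges)
outdegrees = weighted_outdegree(toy_edges)
print(max(ranks, key=ranks.get))            # most-cited field: top knowledge exporter
print(max(outdegrees, key=outdegrees.get))  # heaviest citer: top knowledge importer
```

In this toy graph, the field that receives the most weighted citations gets the highest PageRank, whereas the field that only cites others has the highest weighted outdegree, mirroring the exporter and importer roles described in the text.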
As depicted in Figure 5A, categories with high PageRank scores mostly coincided with the frequently cited fields identified above and were grouped into two communities through modularity (upper half and lower half). The upper half region centered on Radiology, Nuclear Medicine, and Medical Imaging, which had the highest PageRank score (0.191) and proved to be a field with a significant influence on deep learning studies in biomedicine. Meanwhile, important knowledge exporters to this field included Engineering, Biomedical (0.134); Engineering, Electrical and Electronic (0.110); and Computer Science, Interdisciplinary Applications (0.091). The lower half region mainly comprised categories with comparable PageRank scores in which knowledge was frequently exchanged between one another, including Biochemical Research Methods (0.053), Multidisciplinary Sciences (0.053), Biochemistry and Molecular Biology (0.052), Biotechnology and Applied Microbiology (0.050), and Mathematical and Computational Biology (0.048). Specifically, in Figure 5B, Mathematical and Computational Biology (1992), Biotechnology and Applied Microbiology (1836), and Biochemical Research Methods (1807) were identified as major knowledge importers with the highest weighted outdegrees, whereas Biochemistry and Molecular Biology (344) had a relatively low weighted outdegree, indicating its role as a source of knowledge for these fields.
We analyzed the 10 most frequently cited studies to gain an in-depth understanding of the most influential works and assigned these papers to one of the three categories: review, application, or development. Review articles provided comprehensive overviews of the development and applications of deep learning [ 1 , 3 ], with 1 focusing on applications to medical image analysis [ 4 ]. We summarize the 7 application (denoted by A ) or development (denoted by D ) studies in Table 4 .
Table 4. Content analysis matrix of the highly cited references in the application or development category.
Category | Citation count, n | Research topic: task type | Objectives | Methods (deep learning architectures) |
A1 [ ] | 53 | Diagnostic image analysis: classification | Apply CNN to classifying skin lesions from clinical images | Inception version 3 fine-tuned end to end with images; tested against dermatologists on 2 binary classifications |
A2 [ ] | 51 | Diagnostic image analysis: classification | Apply CNN to detecting referrable diabetic retinopathy on retinal fundus images | Inception version 3 trained and validated using 2 data sets of images graded by ophthalmologists |
D1 [ ] | 34 | Computer science | Develop a new gradient-based RNN to solve error backflow problems | LSTM achieved constant error flow through memory cells regulated by gate units; tested numerous times against other methods |
D2 [ ] | 33 | Sequence analysis: binding (variant effects) prediction | Propose a predictive model for sequence specificities of DNA- and RNA-binding proteins | CNN-based DeepBind trained fully automatically through parallel implementation to predict and visualize binding specificities and variation effects |
A3 [ ] | 27 | Diagnostic image analysis: classification | Evaluate factors of using CNNs for thoracoabdominal lymph node detection and interstitial lung disease classification | Compare performances of AlexNet, CifarNet, and GoogLeNet trained with transfer learning and different data set characteristics |
D3 [ ] | 23 | Sequence analysis: chromatin profiles (variant effects) prediction | Propose a model for predicting noncoding variant effects from genomic sequence | CNN-based DeepSEA trained for chromatin profile prediction to estimate variant effects with single nucleotide sensitivity and prioritize functional variants |
A4 [ ] | 23 | Diagnostic image analysis: classification | Evaluate CNNs for tuberculosis detection on chest radiographs | Compare performances of AlexNet and GoogLeNet and ensemble of 2 trained with transfer learning, augmented data set, and radiologist-augmented approach |
a CNN: convolutional neural network.
b RNN: recurrent neural network.
c LSTM: long short-term memory.
In these studies, excluding the study by Hochreiter and Schmidhuber [ 135 ], whose research topic pertained to computer science, deep learning was used for diagnostic image analysis of various areas [ 12 - 14 , 136 ] and for sequence analysis of proteins [ 21 ] or genomes [ 22 ]. The main architectures implemented to achieve the different research objectives mostly comprised CNNs [ 12 - 14 , 136 ] or CNN-based novel models [ 21 , 22 ] and RNNs [ 135 ]. The findings indicated that these deep neural networks either outperformed previous methods or achieved a performance comparable with that of human experts.
With the increase in biomedical research using deep learning techniques, we aimed to gain a quantitative and qualitative understanding of the scientific domain, as reflected in the published literature. For this purpose, we conducted a scientometric analysis of deep learning studies in biomedicine.
Through the metadata and content analyses of bibliographic records, we identified the current leading fields and research topics, the most prominent being radiology and medical imaging. Other biomedical fields that have led this domain included biomedical engineering, mathematical and computational biology, and biochemical research methods. As part of interdisciplinary research, computer science and electrical engineering were important fields as well. The major research topics that were studied included computer-assisted image interpretation and diagnosis (which involved localizing or segmenting certain areas for classifying or predicting diseases), image processing such as medical image reconstruction or registration, and sequence analysis of proteins or RNA to understand protein structure and discover or design drugs. These topics were particularly prevalent in their application to neoplasms.
Furthermore, although the deep learning techniques proposed for these themes were predominantly CNN based, different architectures were preferred for different research tasks. The findings showed that CNN-based models mostly focused on analyzing medical image data, with RNN architectures for sequential data analysis and AEs for unsupervised dimensionality reduction yet to be actively explored. Other deep learning methods, such as deep belief networks [ 137 , 138 ], deep Q network [ 139 ], and dictionary learning [ 140 ], have also been applied to biomedical research but were excluded from the content analysis because of low citation count. As deep learning is a rapidly evolving field, future biomedical researchers should pay attention to emerging trends and stay aware of state-of-the-art models for enhanced performance, such as transformer-based models, including bidirectional encoder representations from transformers for NLP [ 141 ]; wav2vec for speech recognition [ 142 ]; and the Swin transformer for computer vision tasks of image classification, segmentation, and object detection [ 143 ].
The findings from the analysis of the cited references revealed patterns of knowledge diffusion. In the analysis, radiology and medical imaging appeared to be the most significant knowledge source and an important field in the knowledge diffusion network. Relatedly, we identified knowledge exporters to this field, including biomedical engineering, electrical engineering, and computer science, as important, despite their relatively low citation counts. Furthermore, citation patterns revealed clique-like relationships among the four fields—biochemical research methods, biochemistry and molecular biology, biotechnology and applied microbiology, and mathematical and computational biology—with each being a source of knowledge and diffusion for the others.
Beyond knowledge diffusion, knowledge integration was also encouraged through collaboration among authors from different organizations and academic disciplines. Coauthorship analysis revealed active research collaboration between universities and hospitals and between hospitals and companies. Separately, we identified an engineering-oriented cluster and biomedicine-oriented clusters of disciplines, among which we observed a range of disciplinary collaborations. The 2 most prominent were those between radiology and medical imaging and computer science and between radiology and medical imaging and electrical engineering; these were the 3 disciplines most involved in publishing and collaboration. Meanwhile, pathology and public health showed a high ratio of collaborative research to publications, whereas computational biology showed a low collaborative ratio.
This study has the following limitations that may have affected data analysis and interpretation. First, focusing only on published studies may have underrepresented the field. Second, publication data were only retrieved from PubMed; although PubMed is one of the largest databases for biomedical literature, other databases such as DataBase systems and Logic Programming may also include relevant studies. Third, the use of PubMed limited our data to biomedical journals and proceedings. Given that deep learning is an active research area in computer science, computer science conference articles are valuable sources of data that were not considered in this study. Finally, our current data retrieval strategy involved searching deep learning as the major MeSH term, which increased precision but may have omitted relevant studies that were not explicitly tagged as deep learning . We plan to expand our scope in future work to consider other bibliographic databases and search terms as well.
In this study, we investigated the landscape of deep learning research in biomedicine and identified major research topics, influential works, knowledge diffusion, and research collaboration through scientometric analyses. The results showed a predominant focus on research applying deep learning techniques, especially CNNs, to radiology and medical imaging and confirmed the interdisciplinary nature of this domain, especially between engineering and biomedical fields. However, diverse biomedical applications of deep learning in the fields of genetics and genomics, medical informatics focusing on text or speech data, and signal processing of various activities (eg, brain, heart, and human) will further boost the contribution of deep learning in addressing biomedical research problems. As such, although deep learning research in biomedicine has been successful, we believe that there is a need for further exploration, and we expect the results of this study to help researchers and communities better align their present and future work.
AE | autoencoder |
CNN | convolutional neural network |
FCNN | fully convolutional neural network |
MeSH | Medical Subject Heading |
NLP | natural language processing |
ResNet | residual neural network |
RNN | recurrent neural network |
WoS | Web of Science |
Authors' Contributions: SN and YZ designed the study. SN, DK, and WJ analyzed the data. SN took the lead in the writing of the manuscript. YZ supervised and implemented the study. All authors contributed to critical edits and approved the final manuscript.
Conflicts of Interest: None declared.
Trends examined with machine-learning analysis, which could help inform public health policies. By PLOS ONE and the University at Buffalo.
Release Date: December 20, 2021
Yingjie Hu, assistant professor of geography in the UB College of Arts and Sciences.
Brian Quigley, research assistant professor of medicine in the Jacobs School of Medicine and Biomedical Sciences at UB and the UB Clinical and Research Institute on Addictions.
Dane Taylor, assistant professor of mathematics in the UB College of Arts and Sciences.
BUFFALO, N.Y. — An analysis of data from 16 U.S. states suggests that the first few months of the COVID-19 pandemic saw increases in wine and spirit sales, accompanied by notable changes in the relationship between alcohol sales and people’s visits to businesses that sell alcohol.
University at Buffalo researchers Yingjie Hu, Brian M. Quigley and Dane Taylor present these findings in the open-access journal PLOS ONE on Dec. 17. The team notes that trends varied by state.
After U.S. states implemented stay-at-home orders and other restrictions to reduce the spread of COVID-19 in March 2020, anecdotes suggested an increase in alcohol sales. However, data-driven investigations into whether alcohol sales and use did indeed increase have produced mixed results.
To help clarify the potential impact of COVID-19 lockdowns and other social distancing measures on the dynamics of alcohol sales, Hu and colleagues conducted an analysis of relevant data from 16 U.S. states, comparing the period from March to June 2020 to the same period in 2018 and 2019.
“Anonymized human mobility data and geospatial analysis help us understand how people’s visiting behavior to alcohol outlets changed during the stay-at-home period of COVID-19, and how such behavior change varied across different geographic regions,” says Hu, PhD, an assistant professor of geography in the UB College of Arts and Sciences.
“Understanding how alcohol purchase behavior is changed by events such as COVID is important because heavy alcohol use is known to be associated with numerous social problems, especially within the home,” says Quigley, PhD, research assistant professor of medicine in the Jacobs School of Medicine and Biomedical Sciences at UB and the UB Clinical and Research Institute on Addictions.
Using a variety of analytical techniques, including machine-learning methods, they evaluated monthly alcohol sales data reported by the U.S. National Institute on Alcohol Abuse and Alcoholism (NIAAA), as well as anonymized mobility data from over 45 million smart mobile devices (mostly smartphones) indicating people’s visits to businesses where alcohol is sold. (The NIAAA data used in the study focuses on monthly sales of alcohol for 14 U.S. states. It includes sales of spirits, wine and beer, but not all states report data in all of those categories. The anonymized mobility data included information for these 14 states, plus two others.)
The analysis found that overall, sales of spirits and wine increased in the early months of the pandemic — by as much as 20-40% in some states in certain months — while beer sales declined overall compared to the same period during recent years. Meanwhile, people’s visits to bars and pubs declined, but visits to liquor stores increased.
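To make the year-over-year comparison concrete, here is a minimal Python sketch of a percent change computed against a baseline. It assumes the baseline is the mean of the same month in 2018 and 2019, which the release does not specify. The function name `percent_change_vs_baseline` and all sales figures are hypothetical illustrations, not values from the study.

```python
from statistics import mean


def percent_change_vs_baseline(current, baseline_values):
    """Return the percent change of `current` relative to the mean of `baseline_values`."""
    baseline = mean(baseline_values)
    if baseline == 0:
        raise ValueError("baseline mean is zero; percent change is undefined")
    return (current - baseline) / baseline * 100.0


# Hypothetical monthly sales volumes in arbitrary units.
# These numbers are made up for illustration and are NOT data from the study.
sales = {
    "March": {"2018": 100.0, "2019": 110.0, "2020": 126.0},
    "April": {"2018": 95.0, "2019": 105.0, "2020": 130.0},
    "May": {"2018": 102.0, "2019": 98.0, "2020": 90.0},
}

# Assumption: the baseline for each month is the mean of 2018 and 2019.
changes = {
    month: percent_change_vs_baseline(v["2020"], [v["2018"], v["2019"]])
    for month, v in sales.items()
}

for month, change in changes.items():
    direction = "increase" if change > 0 else "decrease"
    print(f"{month}: {abs(change):.1f}% {direction} vs. 2018-2019 baseline")
```

A positive value indicates sales above the baseline for that month, and a negative value indicates a decline. The study itself used additional analytical techniques, including machine-learning methods, that this sketch does not attempt to reproduce.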
Dynamics varied significantly across states. For example, while beer sales decreased in most states between March and June 2020 compared with the same months in recent years, they increased in Kansas, Arkansas and Texas. Meanwhile, Texas, Kentucky and Virginia showed sustained increases in their sales of both spirits and wine, which the authors suggest “can be alarming signals for problematic alcohol use.”
“If data can provide information about geographic areas in which alcohol use increases during certain types of events such as during severe weather, high unemployment, or events such as the COVID pandemic, this information can be useful to help prepare law enforcement, medical professionals and substance use disorder treatment providers to address alcohol-related issues associated with such times,” Quigley says.
Machine-learning assessments in the study point to a significant shift in the relationship between alcohol sales data and visits to various alcohol outlets. More research will be necessary to understand how people’s behaviors changed, but these findings suggest the possibility that some states may have seen an increase in online alcohol purchases or panic buying of spirits and wine.
The research team notes that the study has some limitations: For example, many states were not included in the NIAAA dataset, and the human mobility data was not able to capture alcohol sales at places such as grocery stores, where sales of alcohol are mixed with sales of other items. Nevertheless, these results provide insights into the potential effects of lockdown policies on alcohol use and could inform future public health policies to address alcohol-related social issues, the researchers say.
From a research methodological perspective, Taylor, PhD, UB assistant professor of mathematics, notes, “Interfacing new data sources such as anonymized human mobility data with public health challenges that are difficult or expensive to directly measure reveals new methodological challenges for applied machine learning research.”
Charlotte Hsu is a former staff writer in University Communications.