Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

publications

Comparing Random-Based and k-Anonymity-Based Algorithms for Graph Anonymization

Published in International Conference on Modeling Decisions for Artificial Intelligence, 2012

Recently, several anonymization algorithms have appeared for privacy preservation on graphs. Some of them are based on randomization techniques and on k-anonymity concepts. We can use both of them to obtain an anonymized graph with a given k-anonymity value. In this paper we compare algorithms based on both techniques in order to obtain an anonymized graph with a desired k-anonymity value. We want to analyze the complexity of these methods to generate anonymized graphs and the quality of the resulting graphs.

Recommended citation: Casas-Roma, J., Herrera-Joancomartí, J., Torra, V. (2012). Comparing Random-Based and k-Anonymity-Based Algorithms for Graph Anonymization. In: Torra, V., Narukawa, Y., López, B., Villaret, M. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2012. Lecture Notes in Computer Science, vol 7647. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34620-0_19

Analyzing the Impact of Edge Modifications on Networks

Published in International Conference on Modeling Decisions for Artificial Intelligence, 2013

Most recent anonymization algorithms for networks are based on edge modification, i.e., adding and/or deleting edges on a network. However, none of them considers edge relevance when deciding which edges may be removed and which ones must be preserved. Considering edge relevance can help us to improve data utility and reduce information loss. In this paper we analyse different measures for quantifying edge relevance. We also present a new, simple metric for edge relevance on medium or large networks.

Recommended citation: Casas-Roma, J., Herrera-Joancomartí, J., Torra, V. (2013). Analyzing the Impact of Edge Modifications on Networks. In: Torra, V., Narukawa, Y., Navarro-Arribas, G., Megías, D. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2013. Lecture Notes in Computer Science, vol 8234. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41550-0_26

An algorithm for k-degree anonymity on large networks

Published in ASONAM '13: Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2013

In this paper, we consider the problem of anonymization on large networks. There are some anonymization methods for networks, but most of them cannot be applied to large networks because of their complexity. We present an algorithm for k-degree anonymity on large networks. Given a network G, we construct a k-degree anonymous network, G̃, with the minimum number of edge modifications. We devise a simple and efficient algorithm for solving this problem on large networks. Our algorithm uses univariate micro-aggregation to anonymize the degree sequence, and then it modifies the graph structure to meet the k-degree anonymous sequence. We apply our algorithm to different large real datasets and demonstrate its efficiency and practical utility.

Recommended citation: Jordi Casas-Roma, Jordi Herrera-Joancomartí, and Vicenç Torra. 2013. An algorithm for k-degree anonymity on large networks. In Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '13). Association for Computing Machinery, New York, NY, USA, 671–675. https://doi.org/10.1145/2492517.2492643
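The abstract of "An algorithm for k-degree anonymity on large networks" mentions univariate micro-aggregation of the degree sequence. As a rough illustration only, the Python sketch below sorts the degrees, partitions them into consecutive groups of at least k values, and replaces each value with its rounded group mean. The fixed-size grouping and the function names `microaggregate_degrees` and `is_k_degree_anonymous` are assumptions made for this example; the paper's actual partitioning strategy and the subsequent graph-modification step are not reproduced here, and the sketch does not check that the resulting sequence is realizable as a graph.

```python
from collections import Counter


def microaggregate_degrees(degrees, k):
    """Return a k-anonymous version of a degree sequence (illustrative only).

    Sorted degrees are split into consecutive groups of size k; the last
    group absorbs any remainder so that no group has fewer than k members.
    Every degree is replaced by the rounded mean of its group.
    """
    n = len(degrees)
    if k < 1 or n < k:
        raise ValueError("need 1 <= k <= number of nodes")
    # Indices of nodes ordered by increasing degree.
    order = sorted(range(n), key=lambda i: degrees[i])
    result = [0] * n
    start = 0
    while start < n:
        end = start + k
        if n - end < k:  # fewer than k left over: merge them into this group
            end = n
        group = order[start:end]
        mean = round(sum(degrees[i] for i in group) / len(group))
        for i in group:
            result[i] = mean
        start = end
    return result


def is_k_degree_anonymous(degrees, k):
    """True if every degree value is shared by at least k nodes."""
    return all(count >= k for count in Counter(degrees).values())


if __name__ == "__main__":
    original = [1, 1, 2, 3, 3, 4, 7, 8]  # hypothetical degree sequence
    anonymized = microaggregate_degrees(original, k=3)
    print(anonymized, is_k_degree_anonymous(anonymized, 3))
```

Because whole groups receive the same value, each degree in the output is shared by at least k nodes, which is the k-degree anonymity condition on the sequence.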

Privacy-Preserving on Graphs Using Randomization and Edge-Relevance

Published in International Conference on Modeling Decisions for Artificial Intelligence, 2014

The problem of anonymization on graphs and the utility of the released data are considered in this paper. Although there are some anonymization methods for graphs, most of them cannot be applied on medium or large networks due to their complexity. Nevertheless, random-based methods are able to work with medium or large networks while fulfilling the desired privacy level. In this paper, we devise a simple and efficient algorithm for randomization on graphs. Our algorithm considers the edge’s relevance, preserving the most important edges of the graph, in order to improve the data utility and reduce the information loss on anonymous data. We apply our algorithm to different real datasets and demonstrate their efficiency and practical utility.

Recommended citation: Casas-Roma, J. (2014). Privacy-Preserving on Graphs Using Randomization and Edge-Relevance. In: Torra, V., Narukawa, Y., Endo, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2014. Lecture Notes in Computer Science, vol 8825. Springer, Cham. https://doi.org/10.1007/978-3-319-12054-6_18

A Summary of k-Degree Anonymous Methods for Privacy-Preserving on Networks

Published in Studies in Computational Intelligence, 2014

In recent years there has been a significant rise in the use of graph-formatted data. For instance, social and healthcare networks present relationships among users, revealing interesting and useful information for researchers and other third parties. When someone wants to publicly release this information, it is necessary to preserve the privacy of the users who appear in these networks. Therefore, it is essential to apply an anonymization process to the data in order to preserve users’ privacy. Anonymization of graph-based data is a problem which has been widely studied in recent years, and several anonymization methods have been developed. In this chapter we summarize some methods for privacy-preserving on networks, focusing on methods based on the k-anonymity model. We also compare the results of some k-degree anonymous methods on our experimental setup, by evaluating the data utility and the information loss on real networks.

Recommended citation: Casas-Roma, J., Herrera-Joancomartí, J., Torra, V. (2015). A Summary of k-Degree Anonymous Methods for Privacy-Preserving on Networks. In: Navarro-Arribas, G., Torra, V. (eds) Advanced Research in Data Privacy. Studies in Computational Intelligence, vol 567. Springer, Cham. https://doi.org/10.1007/978-3-319-09885-2_13

Evolutionary Algorithm for Graph Anonymization

Published in arXiv, 2014

In recent years there has been a significant increase in the use of graphs as a tool for representing information. It is very important to preserve the privacy of users when one wants to publish this information, especially in the case of social graphs. In this case, it is essential to apply an anonymization process to the data in order to preserve users’ privacy. In this paper we present an algorithm for graph anonymization, called Evolutionary Algorithm for Graph Anonymization (EAGA), based on edge modifications to preserve the k-anonymity model.

Recommended citation: Casas-Roma, J., Herrera-Joancomartí, J., & Torra, V. (2013). Evolutionary Algorithm for Graph Anonymization (Version 2). arXiv. https://doi.org/10.48550/ARXIV.1310.0229

Anonymizing Graphs: Measuring Quality for Clustering

Published in Knowledge and Information Systems, 2014

Anonymization of graph-based data is a problem which has been widely studied in recent years, and several anonymization methods have been developed. Information loss measures have been used to evaluate the noise introduced in the anonymized data. Generic information loss measures ignore the intended use of the anonymized data. When data has to be released to third parties, and there is no control over what kind of analyses users could do, these measures are the standard ones. In this paper we study different generic information loss measures for graphs, comparing such measures to the cluster-specific ones. We want to evaluate whether the generic information loss measures are indicative of the usefulness of the data for subsequent data mining processes.

Recommended citation: Casas-Roma, J., Herrera-Joancomartí, J. & Torra, V. Anonymizing graphs: measuring quality for clustering. Knowl Inf Syst 44, 507–528 (2015). https://doi.org/10.1007/s10115-014-0774-7

An Evaluation of Edge Modification Techniques for Privacy-Preserving on Graphs

Published in International Conference on Modeling Decisions for Artificial Intelligence, 2015

Noise is added by privacy-preserving methods or anonymization processes to prevent adversaries from re-identifying users in anonymous networks. The noise introduced by the anonymization steps may also affect the data, reducing its utility for subsequent data mining processes. Graph modification approaches are among the most used and well-known methods to protect the privacy of the data. These methods convert the data by means of edge or vertex modifications before releasing the perturbed data. In this paper we want to analyse the edge modification techniques found in the literature covering this topic, and then empirically evaluate the information loss introduced by each of these methods. We want to point out how these methods affect the main properties and characteristics of the network, since it will help us to choose the best one to achieve a desired privacy level while preserving data utility.

Recommended citation: Casas-Roma, J. (2015). An Evaluation of Edge Modification Techniques for Privacy-Preserving on Graphs. In: Torra, V., Narukawa, T. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2015. Lecture Notes in Computer Science, vol 9321. Springer, Cham. https://doi.org/10.1007/978-3-319-23240-9_15

Community-Preserving Generalization of Social Networks

Published in ASONAM 15: Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015, 2015

In this paper, we tackle the problem of graph generalization in the context of privacy-preserving social network mining. By grouping together nodes that are not only similar but that also belong to the same k-shells, we better preserve the community structure of the graph, its utility in case of clustering-related applications, while still achieving some privacy level through the concept of graph generalization. We conduct empirical evaluations of our approach on synthetic and real social network data, demonstrating its utility and practical application.

Recommended citation: Jordi Casas-Roma and François Rousseau. 2015. Community-Preserving Generalization of Social Networks. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015 (ASONAM 15). Association for Computing Machinery, New York, NY, USA, 1465–1472. https://doi.org/10.1145/2808797.2808854

k-Degree anonymity and edge selection: improving data utility in large networks

Published in Knowledge and Information Systems, 2016

The problem of anonymization in large networks and the utility of released data are considered in this paper. Although there are some anonymization methods for networks, most of them cannot be applied in large networks because of their complexity. In this paper, we devise a simple and efficient algorithm for k-degree anonymity in large networks. Our algorithm constructs a k-degree anonymous network by the minimum number of edge modifications. We compare our algorithm with other well-known k-degree anonymous algorithms and demonstrate that information loss in real networks is lowered. Moreover, we consider the edge relevance in order to improve the data utility on anonymized networks. By considering the neighbourhood centrality score of each edge, we preserve the most important edges of the network, reducing the information loss and increasing the data utility. An evaluation of clustering processes is performed on our algorithm, proving that edge neighbourhood centrality increases data utility. Lastly, we apply our algorithm to different large real datasets and demonstrate their efficiency and practical utility.

Recommended citation: Casas-Roma, J., Herrera-Joancomartí, J. & Torra, V. k-Degree anonymity and edge selection: improving data utility in large networks. Knowl Inf Syst 50, 447–474 (2017). https://doi.org/10.1007/s10115-016-0947-7

A survey of graph-modification techniques for privacy-preserving on networks

Published in Artificial Intelligence Review, 2016

Recently, a huge number of social networks have been made publicly available. In parallel, several definitions and methods have been proposed to protect users’ privacy when publicly releasing these data. Some of them were adapted from relational dataset anonymization techniques, which are more mature than network anonymization techniques. In this paper we summarize privacy-preserving techniques, focusing on graph-modification methods which alter the graph’s structure and release the entire anonymous network. These methods allow researchers and third parties to apply all graph-mining processes on anonymous data, from local to global knowledge extraction.

Recommended citation: Casas-Roma, J., Herrera-Joancomartí, J. & Torra, V. A survey of graph-modification techniques for privacy-preserving on networks. Artif Intell Rev 47, 341–366 (2017). https://doi.org/10.1007/s10462-016-9484-8

A Preliminary Study about the Analytic Maturity of Educational Organizations

Published in 2016 International Conference on Intelligent Networking and Collaborative Systems (INCoS), 2016

In recent years, universities and other educational organizations have increased their use of analytics in order to improve the efficiency of their internal management and to increase their teaching and research quality. In this paper we present the results of a survey we conducted to determine the analytical level of educational organizations and how it differs from the analytical level of non-educational organizations. To do so, we used the DELTA model to measure the analytical maturity of over 100 organizations, of which around 15% are educational. The aim of the paper is to provide background information on the current state of analytics and further steps.

Recommended citation: I. Guitart, J. Conesa and J. Casas, "A Preliminary Study about the Analytic Maturity of Educational Organizations," 2016 International Conference on Intelligent Networking and Collaborative Systems (INCoS), Ostrava, Czech Republic, 2016, pp. 345-350, https://doi.org/10.1109/INCoS.2016.53

Community-preserving anonymization of graphs

Published in Knowledge and Information Systems, 2017

In this paper, we propose a novel edge modification technique that better preserves the communities of a graph while anonymizing it. By maintaining the core number sequence of a graph, its coreness, we retain most of the information contained in the network while allowing changes in the degree sequence, i.e., obfuscating the visible data an attacker has access to. We reach a better trade-off between data privacy and data utility than with existing methods by capitalizing on the slack between apparent degree (node degree) and true degree (node core number). Our extensive experiments on six diverse standard network datasets support this claim. Our framework compares our method to others that are used as proxies for privacy protection in the relevant literature. We demonstrate that our method leads to higher data utility preservation, especially in clustering, for the same levels of randomization and k-anonymity.

Recommended citation: Rousseau, F., Casas-Roma, J. & Vazirgiannis, M. Community-preserving anonymization of graphs. Knowl Inf Syst 54, 315–343 (2018). https://doi.org/10.1007/s10115-017-1064-y
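The abstract of "Community-preserving anonymization of graphs" relies on the gap between a node's degree and its core number. The Python sketch below is a minimal illustration of those two quantities: it computes core numbers by repeatedly peeling the node of lowest remaining degree, then reports the slack (degree minus core number) per node. The adjacency-dictionary input and the names `core_numbers` and `degree_core_slack` are assumptions for this example; it is a simple quadratic-time peeling routine, not the paper's anonymization method.

```python
def core_numbers(adj):
    """Core number of every node in an undirected graph.

    `adj` maps each node to the set of its neighbours. Nodes are peeled in
    order of lowest remaining degree; a node's core number is the largest
    minimum degree seen up to the moment it is removed.
    """
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def degree_core_slack(adj):
    """Degree minus core number for each node."""
    core = core_numbers(adj)
    return {v: len(adj[v]) - core[v] for v in adj}


if __name__ == "__main__":
    # Hypothetical toy graph: a triangle (0, 1, 2) plus a pendant node 3.
    graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
    print(core_numbers(graph))
    print(degree_core_slack(graph))
```

In the toy graph, node 0 has degree 3 but core number 2, so one of its edges can change without moving it out of its core; this is the kind of slack the abstract refers to.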

Scalable non-deterministic clustering-based k-anonymization for rich networks

Published in International Journal of Information Security, 2018

In this paper, we tackle the problem of graph anonymization in the context of privacy-preserving social network mining. We present a greedy and non-deterministic algorithm to achieve k-anonymity on labeled and undirected networks. Our work aims to create a scalable algorithm for real-world big networks, which runs in parallel and uses biased randomization for improving the quality of the solutions. We propose new metrics that consider the utility of the clusters from a recommender system point of view. We compare our approach to SaNGreeA, a well-known state-of-the-art algorithm for k-anonymity generalization. Finally, we have performed scalability tests, with up to 160 machines within the Hadoop framework, for anonymizing a real-world dataset with around 830 K nodes and 63 M relationships, demonstrating our method’s utility and practical applicability.

Recommended citation: Ros-Martín, M., Salas, J. & Casas-Roma, J. Scalable non-deterministic clustering-based k-anonymization for rich networks. Int. J. Inf. Secur. 18, 219–238 (2019). https://doi.org/10.1007/s10207-018-0409-1

k-Degree anonymity on directed networks

Published in Knowledge and Information Systems, 2018

In this paper, we consider the problem of anonymization on directed networks. Although there are several anonymization methods for networks, most of them have explicitly been designed to work with undirected networks and they cannot be straightforwardly applied when they are directed. Moreover, ignoring the direction of the edges causes important information loss on the anonymized networks in the best case. In the worst case, the direction of the edges may be used for reidentification, if it is not considered in the anonymization process. Here, we propose two different models for k-degree anonymity on directed networks, and we also present algorithms to fulfill these k-degree anonymity models. Given a network G, we construct a k-degree anonymous network by the minimum number of edge additions. Our algorithms use multivariate micro-aggregation to anonymize the degree sequence, and then, they modify the graph structure to meet the k-degree anonymous sequence. We apply our algorithms to several real datasets and demonstrate their efficiency and practical utility.

Recommended citation: Casas-Roma, J., Salas, J., Malliaros, F.D. et al. k-Degree anonymity on directed networks. Knowl Inf Syst 61, 1743–1768 (2019). https://doi.org/10.1007/s10115-018-1251-5

Forecasting Water Levels of Catalan Reservoirs

Published in International Conference on Modeling Decisions for Artificial Intelligence, 2019

Reservoirs are largely natural or artificial lakes used as a source of water supply for society’s daily needs. However, reservoirs are limited natural resources whose water levels vary according to annual rainfall and other natural events. Therefore, prediction techniques are helpful to manage water use more efficiently. This paper compares state-of-the-art methods to predict the water level in Catalan reservoirs, comparing two approaches: using the water level alone (uni-variant) and adding meteorological data (multi-variant). With respect to related work, our contribution includes a longer time series prediction while keeping a high precision. The results show that combining Support Vector Machine and the multi-variant approach provides the highest precision, with an R^2 value of 0.99.

Recommended citation: Parada, R., Font, J., Casas-Roma, J. (2019). Forecasting Water Levels of Catalan Reservoirs. In: Torra, V., Narukawa, Y., Pasi, G., Viviani, M. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2019. Lecture Notes in Computer Science, vol 11676. Springer, Cham. https://doi.org/10.1007/978-3-030-26773-5_15
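The abstract of "Forecasting Water Levels of Catalan Reservoirs" reports an R^2 value. For readers unfamiliar with the metric, the sketch below computes the usual coefficient of determination, R^2 = 1 - SS_res / SS_tot, in plain Python. The function name `r_squared` and the sample water levels are hypothetical and are not taken from the paper, which may have computed the score with a different tool or convention.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are equal")
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical reservoir levels (percent of capacity), not real data.
    levels = [62.0, 60.5, 58.0, 61.0, 65.5]
    forecast = [61.5, 60.0, 58.5, 61.5, 65.0]
    print(round(r_squared(levels, forecast), 3))
```

A score of 1 means the predictions match the observations exactly, while a score of 0 means they do no better than always predicting the observed mean.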

Predicting Energy Generation Using Forecasting Techniques in Catalan Reservoirs

Published in Energies, 2019

Reservoirs are natural or artificial lakes used as a source of water supply for society’s daily needs. In addition, hydroelectric power plants produce electricity as water flows through the reservoir. However, reservoirs are limited natural resources, since water levels, and consequently energy generation, vary according to annual rainfall and other natural events. Therefore, forecasting techniques are helpful to predict the water level and, thus, electricity production. This paper examines state-of-the-art methods to predict the water level in Catalan reservoirs, comparing two approaches: using the water level alone (uni-variant) and adding meteorological data (multi-variant). With respect to related work, our contribution includes a longer time series prediction while keeping a high precision. The results show that combining Support Vector Machine and the multi-variant approach provides the highest precision, with an R^2 value of 0.99.

Recommended citation: Parada, R., Font, J., & Casas-Roma, J. (2019). Predicting Energy Generation Using Forecasting Techniques in Catalan Reservoirs. Energies, 12(10), 1832. https://doi.org/10.3390/en12101832

Towards the Analysis of How Anonymization Affects Usefulness of Health Data in the Context of Machine Learning

Published in 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS), 2019

The volume and quality of patient data stored and collected have grown drastically in recent years. Such data can be analyzed by machine learning algorithms to improve health and well-being. However, while the distribution of data is beneficial, it should be performed in a way that preserves patient privacy. One would expect to obtain useful information from machine learning algorithms applied to both anonymized and non-anonymized datasets. However, those algorithms can generate lower quality results (even invalid ones) due to information loss during the anonymization process. We aim to analyze the relationship between anonymization and data utility/information loss, through the use of different algorithms and information loss metrics. With that aim, we plan to 1) analyze how real algorithms used on real data are affected by different anonymization techniques; 2) use the lessons learned to design useful metrics for measuring the information loss after anonymization; and 3) validate the proposed metrics by testing them in other environments with different types of data. The expected contributions of the research are more information about how anonymization techniques affect data usefulness, additional knowledge about the machine learning algorithms most suitable for anonymized data, and a set of metrics to measure the usefulness of anonymized data.

Recommended citation: F. Carmona, J. Conesa and J. Casas-Roma, "Towards the Analysis of How Anonymization Affects Usefulness of Health Data in the Context of Machine Learning," 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS), Cordoba, Spain, 2019, pp. 604-608, https://doi.org/10.1109/CBMS.2019.00126

Towards an Analysis of Post-Transcriptional Gene Regulation in Psoriasis via microRNAs using Machine Learning Algorithms

Published in 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS), 2019

Single Nucleotide Polymorphisms (SNPs) are the most common inter-individual variations in humans. With the emergence of Next Generation Sequencing (NGS), they gained popularity as disease biomarkers for diagnosis and/or prognosis using Genome-Wide Association Studies. They are found along the genome, but mostly in the non-coding regions. In these cases, SNPs may affect regulatory regions, such as promoters, enhancers or microRNA (miRNA) binding sites. miRNAs are short non-coding RNAs that are estimated to regulate up to 60% of gene expression at the post-transcriptional level. It is well known that they are implicated in many diseases by misregulating the expression of genes. New computational technologies allow extracting more information from RNA-Seq data, making it possible not only to measure gene expression but also to map SNPs on the genome. To understand and model the effects of this type of RNA on disease phenotype, machine learning algorithms will be trained using SNPs located in the 3′ UTR (UnTranslated Region) of deregulated genes to find biomarkers and describe the mechanism of action.

Recommended citation: J. Carrere-Molina, L. Subirats and J. Casas-Roma, "Towards an Analysis of Post-Transcriptional Gene Regulation in Psoriasis via microRNAs using Machine Learning Algorithms," 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS), Cordoba, Spain, 2019, pp. 600-603, https://doi.org/10.1109/CBMS.2019.00125

An evaluation of vertex and edge modification techniques for privacy-preserving on graphs

Published in Journal of Ambient Intelligence and Humanized Computing, 2019

Noise is added by privacy-preserving methods or anonymization processes to prevent adversaries from re-identifying users in anonymous networks. The noise introduced by the anonymization steps may also affect the data, reducing its utility for subsequent data mining processes. Graph modification approaches are one of the most used and well-known methods to protect the privacy of the data. These methods convert the data by means of vertex and edge modifications before releasing the perturbed data. In this paper we want to analyze the vertex and edge modification techniques found in literature covering this topic. We empirically evaluate the information loss introduced by each of these methods not only using generic metrics related to graph properties, but also using some specific metrics related to real graph-mining tasks. We want to point out how these methods affect the main properties and characteristics of the network, since it will help us to choose the best one to achieve a desired privacy level while preserving data utility.

Recommended citation: Casas-Roma, J. An evaluation of vertex and edge modification techniques for privacy-preserving on graphs. J Ambient Intell Human Comput 14, 15109–15125 (2023). https://doi.org/10.1007/s12652-019-01363-6

DUEF-GA: data utility and privacy evaluation framework for graph anonymization

Published in International Journal of Information Security, 2019

Anonymization of graph-based data is a problem which has been widely studied over the last years, and several anonymization methods have been developed. Information loss measures have been used to evaluate data utility and information loss in the anonymized graphs. However, there is no consensus about how to evaluate data utility and information loss in privacy-preserving and anonymization scenarios, where the anonymous datasets were perturbed to hinder re-identification processes. Authors use diverse metrics to evaluate data utility and, consequently, it is complex to compare different methods or algorithms in the literature. In this paper, we propose a framework to evaluate and compare anonymous datasets in a common way, providing an objective score to clearly compare methods and algorithms. Our framework includes metrics based on generic information loss measures, such as average distance or betweenness centrality and also task-specific information loss measures, such as community detection or information flow. Additionally, we provide some metrics to examine re-identification and risk assessment. We demonstrate that our framework could help researchers and practitioners to select the best parametrization and/or algorithm to reduce information loss and maximize data utility.

Recommended citation: Casas-Roma, J. DUEF-GA: data utility and privacy evaluation framework for graph anonymization. Int. J. Inf. Secur. 19, 465–478 (2020). https://doi.org/10.1007/s10207-019-00469-4

Modified connectivity of vulnerable brain nodes in multiple sclerosis, their impact on cognition and their discriminative value

Published in Scientific Reports, 2019

Brain structural network modifications in multiple sclerosis (MS) seem to be clinically relevant. The discriminative ability of those changes to identify MS patients or their cognitive status remains unknown. Therefore, this study aimed to investigate connectivity changes in MS patients related to their cognitive status, and to define an automatic classification method to classify subjects as patients and healthy volunteers (HV) or as cognitively preserved (CP) and impaired (CI) patients. We analysed structural brain connectivity in 45 HV and 188 MS patients (104 CP and 84 CI). A support vector machine with k-fold cross-validation was built using the graph metrics features that best differentiate the groups (p < 0.05). Local efficiency (LE) and node strength (NS) network properties showed the largest differences: 100% and 69.7% of nodes had reduced LE and NS in CP patients compared to HV. Moreover, 55.3% and 57.9% of nodes had decreased LE and NS in CI compared to CP patients, in associative multimodal areas. The classification method achieved an accuracy of 74.8–77.2% to differentiate patients from HV, and 59.9–60.8% to discriminate CI from CP patients. Structural network integrity is widely reduced and worsens as cognitive function declines. Central network properties of vulnerable nodes can be useful to classify MS patients.

Recommended citation: Solana, E., Martinez-Heras, E., Casas-Roma, J. et al. Modified connectivity of vulnerable brain nodes in multiple sclerosis, their impact on cognition and their discriminative value. Sci Rep 9, 20172 (2019). https://doi.org/10.1038/s41598-019-56806-z

Reducing the Learning Domain by Using Image Processing to Diagnose COVID-19 from X-Ray Image

Published in 24th International Conference of the Catalan Association for Artificial Intelligence, 2022

Over the last months, dozens of artificial intelligence (AI) solutions for COVID-19 diagnosis based on chest X-ray image analysis have been proposed, all of them with very impressive sensitivity and specificity results. However, their generalization and translation to clinical practice are rather challenging due to the discrepancies between domain distributions when training and test data come from different sources. Consequently, applying a trained model to a new data set may cause a domain adaptation problem, leading to performance degradation. This research aims to study the impact of image pre-processing on pre-trained deep learning models to reduce the learning domain. The dataset used in this research consists of 5,000 X-ray images obtained from different sources under two categories: negative and positive COVID-19 detection. We implemented transfer learning in 3 popular convolutional neural networks (CNNs): VGG16, VGG19, and DenseNet169. We repeated the study following the same structure for original and pre-processed images. The pre-processing method is based on the Contrast Limited Adaptive Histogram Equalization (CLAHE) filter application and image registration. After evaluating the models, the CNNs that were trained with pre-processed images obtained an accuracy score up to 1.2% better than the unprocessed ones. Furthermore, we can observe that in the 3 CNN models, the repeated misclassified images represent 40.9% (207/506) of the original image dataset with the erroneous result. In pre-processed ones, this percentage is 48.9% (249/509). In conclusion, image processing techniques can help to reduce the learning domain for deep learning applications.

Recommended citation: Abad, M., Casas-Roma, J., & Prados, F. (2022). Reducing the Learning Domain by Using Image Processing to Diagnose COVID-19 from X-Ray Image. In Frontiers in Artificial Intelligence and Applications. IOS Press. https://doi.org/10.3233/faia220343
Download Paper

Applying multilayer analysis to morphological, structural, and functional brain networks to identify relevant dysfunction patterns

Published in Network Neuroscience, 2022

In recent years, research on network analysis applied to MRI data has advanced significantly. However, the majority of the studies are limited to single networks obtained from resting-state fMRI, diffusion MRI, or gray matter probability maps derived from T1 images. Although a limited number of previous studies have combined two of these networks, none have introduced a framework to combine morphological, structural, and functional brain connectivity networks. The aim of this study was to combine the morphological, structural, and functional information, thus defining a new multilayer network perspective. This has proved advantageous when jointly analyzing multiple types of relational data from the same objects simultaneously using graph-mining techniques. The main contribution of this research is the design, development, and validation of a framework that merges these three layers of information into one multilayer network that links and relates the integrity of white matter connections with gray matter probability maps and resting-state fMRI. To validate our framework, several metrics from graph theory are expanded and adapted to our specific domain characteristics. This proof of concept was applied to a cohort of people with multiple sclerosis, and results show that several brain regions with a synchronized connectivity deterioration could be identified.

Recommended citation: Jordi Casas-Roma, Eloy Martinez-Heras, Albert Solé-Ribalta, Elisabeth Solana, Elisabet Lopez-Soley, Francesc Vivó, Marcos Diaz-Hurtado, Salut Alba-Arbalat, Maria Sepulveda, Yolanda Blanco, Albert Saiz, Javier Borge-Holthoefer, Sara Llufriu, Ferran Prados; Applying multilayer analysis to morphological, structural, and functional brain networks to identify relevant dysfunction patterns. Network Neuroscience 2022; 6 (3): 916–933. doi: https://doi.org/10.1162/netn_a_00258
Download Paper

Recent advances in the longitudinal segmentation of multiple sclerosis lesions on magnetic resonance imaging: a review

Published in Neuroradiology, 2022

Multiple sclerosis (MS) is a chronic autoimmune disease characterized by demyelinating lesions that are often visible on magnetic resonance imaging (MRI). Segmentation of these lesions can provide imaging biomarkers of disease burden that can help monitor disease progression and the imaging response to treatment. Manual delineation of MRI lesions is tedious and prone to subjective bias, while automated lesion segmentation methods offer objectivity and speed, the latter being particularly important when analysing large datasets. Lesion segmentation can be broadly categorised into two groups: cross-sectional methods, which use imaging data acquired at a single time-point to characterise MRI lesions; and longitudinal methods, which use imaging data from the same subject acquired at two or more different time-points to characterise lesions over time. The main objective of longitudinal segmentation approaches is to more accurately detect the presence of new MS lesions and the growth or remission of existing lesions, which may be effective biomarkers of disease progression and treatment response. This paper reviews articles on longitudinal MS lesion segmentation methods published over the past 10 years. These are divided into traditional machine learning methods and deep learning techniques. PubMed articles using longitudinal information and comparing fully automatic two time point segmentations in any step of the process were selected. Nineteen articles were reviewed. There is an increasing number of deep learning techniques for longitudinal MS lesion segmentation that are promising to help better understand disease progression.

Recommended citation: Diaz-Hurtado, M., Martínez-Heras, E., Solana, E. et al. Recent advances in the longitudinal segmentation of multiple sclerosis lesions on magnetic resonance imaging: a review. Neuroradiology 64, 2103–2117 (2022). https://doi.org/10.1007/s00234-022-03019-3
Download Paper

CO2 impact on convolutional network model training for autonomous driving through behavioral cloning

Published in Advanced Engineering Informatics, 2023

Autonomous driving and the machine learning (ML) models developed to achieve it have grown rapidly in importance and complexity respectively, and with them has also grown their carbon footprint due to long training times. Given the importance of climate change, it should be necessary to include the CO2 impact of ML models explicitly to encourage competition on more than just model quality. This work presents the implementation of two different Convolutional Neural Network (CNN) training approaches, used for autonomous driving by behavioral cloning in a simulated environment and compares their impact on the CO2 footprint. Using a cloud execution environment and driving data, previously obtained by applying an end-to-end deep learning technique, the first implemented training approach is a classical approach that uses an image generator that carries out the pre-processing and data augmentation during training. The contribution proposed in this paper is a second approach that improves the training by decreasing the training time, carrying out the data augmentation and pre-processing tasks before training the model, storing the result in RAM and then starting the training. The new approach to training presented in this article finishes the training approximately 38 times faster and reduces the carbon footprint impact by approximately 96%. In absolute values, this is a reduction from an average value of approximately 0.1643 (kg) to 0.007 (kg). To estimate the impact of CO2, the hardware used for the project, the training time and the cloud service provider were all taken into account.
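
The figures quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below is only an illustration: `relative_reduction` and `estimate_co2_kg` are hypothetical helper names, and the simple "power × time × carbon intensity" estimate is an assumed simplification, not the estimation procedure used in the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fraction by which `after` is lower than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


def estimate_co2_kg(power_kw: float, hours: float,
                    carbon_intensity_kg_per_kwh: float) -> float:
    """Rough CO2 estimate: energy used (kWh) times grid carbon intensity.

    Assumption for illustration only; the paper states that hardware,
    training time and cloud provider were considered, without this formula.
    """
    return power_kw * hours * carbon_intensity_kg_per_kwh


# Average footprints reported in the abstract, in kg of CO2.
baseline_kg = 0.1643
improved_kg = 0.007

reduction = relative_reduction(baseline_kg, improved_kg)
print(f"Reduction: {reduction:.1%}")  # Reduction: 95.7%
```

The computed value of about 95.7% is consistent with the "approximately 96%" reported in the abstract.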

Recommended citation: Fernando Sevilla Martínez, Raúl Parada, Jordi Casas-Roma, (2023). CO2 impact on convolutional network model training for autonomous driving through behavioral cloning, Advanced Engineering Informatics, Volume 56, N. 101968, https://doi.org/10.1016/j.aei.2023.101968
Download Paper

A synthetic data generation system for myalgic encephalomyelitis/chronic fatigue syndrome questionnaires

Published in Scientific Reports, 2023

Artificial intelligence or machine-learning-based models have proven useful for better understanding various diseases in all areas of health science. Myalgic Encephalomyelitis or chronic fatigue syndrome (ME/CFS) lacks objective diagnostic tests. Some validated questionnaires are used for diagnosis and assessment of disease progression. The availability of a sufficiently large database of these questionnaires facilitates research into new models that can predict profiles that help to understand the etiology of the disease. A synthetic data generator provides the scientific community with databases that preserve the statistical properties of the original, free of legal restrictions, for use in research and education. The initial databases came from the Vall Hebron Hospital Specialized Unit in Barcelona, Spain. A total of 2522 patients diagnosed with ME/CFS were analyzed. Their answers to questionnaires related to the symptoms of this complex disease were used as training datasets. They were fed to deep learning algorithms that provide models with high accuracy [0.69–0.81]. The final model requires SF-36 responses and returns responses from HAD, SCL-90R, FIS8, FIS40, and PSQI questionnaires. A highly reliable and easy-to-use synthetic data generator is offered for research and educational use in this disease, for which there is currently no approved treatment.

Recommended citation: Lacasa, M., Prados, F., Alegre, J. et al. A synthetic data generation system for myalgic encephalomyelitis/chronic fatigue syndrome questionnaires. Sci Rep 13, 14256 (2023). https://doi.org/10.1038/s41598-023-40364-6
Download Paper

Generalizable disease detection using model ensemble on chest X-ray images

Published in Scientific Reports, 2024

In the realm of healthcare, the demand for swift and precise diagnostic tools has been steadily increasing. This study delves into a comprehensive performance analysis of three pre-trained convolutional neural network (CNN) architectures: ResNet50, DenseNet121, and Inception-ResNet-v2. To ensure the broad applicability of our approach, we curated a large-scale dataset comprising a diverse collection of chest X-ray images, that included both positive and negative cases of COVID-19. The models’ performance was evaluated using separate datasets for internal validation (from the same source as the training images) and external validation (from different sources). Our examination uncovered a significant drop in network efficacy, registering a 10.66% reduction for ResNet50, a 36.33% decline for DenseNet121, and a 19.55% decrease for Inception-ResNet-v2 in terms of accuracy. Best results were obtained with DenseNet121 achieving the highest accuracy at 96.71% in internal validation and Inception-ResNet-v2 attaining 76.70% accuracy in external validation. Furthermore, we introduced a model ensemble approach aimed at improving network performance when making inferences on images from diverse sources beyond their training data. The proposed method uses uncertainty-based weighting by calculating the entropy in order to assign appropriate weights to the outputs of each network. Our results showcase the effectiveness of the ensemble method in enhancing accuracy up to 97.38% for internal validation and 81.18% for external validation, while maintaining a balanced ability to detect both positive and negative cases.
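
As a rough illustration of uncertainty-based weighting, the sketch below combines the class probabilities of several models, giving more weight to the models whose predictions have lower entropy. The function names and the specific weighting rule (one minus normalized entropy, renormalized to sum to one) are assumptions made for this example; the abstract does not specify the exact formula used in the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_weighted_ensemble(predictions):
    """Combine per-model class probabilities using entropy-based weights.

    predictions: list of probability vectors, one per model, same length.
    Assumed rule: weight is proportional to 1 - H(p) / log(n_classes).
    """
    n_classes = len(predictions[0])
    if n_classes < 2:
        raise ValueError("need at least two classes")
    max_entropy = math.log(n_classes)
    # Confidence is 1 for a one-hot prediction and 0 for a uniform one.
    confidences = [max(0.0, 1.0 - entropy(p) / max_entropy)
                   for p in predictions]
    total = sum(confidences)
    if total == 0:
        # Every model is maximally uncertain: fall back to equal weights.
        weights = [1.0 / len(predictions)] * len(predictions)
    else:
        weights = [c / total for c in confidences]
    combined = [sum(w * p[k] for w, p in zip(weights, predictions))
                for k in range(n_classes)]
    return combined, weights


# Hypothetical outputs of three models for one image: [negative, positive].
preds = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
combined, weights = entropy_weighted_ensemble(preds)
predicted_class = max(range(len(combined)), key=combined.__getitem__)
print(weights, combined, predicted_class)
```

In this example the second model is maximally uncertain and contributes almost nothing, so the ensemble decision is driven by the two more confident models.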

Recommended citation: Abad, M., Casas-Roma, J. & Prados, F. Generalizable disease detection using model ensemble on chest X-ray images. Sci Rep 14, 5890 (2024). https://doi.org/10.1038/s41598-024-56171-6
Download Paper

Spiking neural networks for autonomous driving: A review

Published in Engineering Applications of Artificial Intelligence, 2024

The rapid progress of autonomous driving (AD) has triggered a surge in demand for safer and more efficient autonomous vehicles, owing to the intricacy of modern urban environments. Traditional approaches to autonomous driving have heavily relied on conventional machine learning methodologies, particularly convolutional neural networks (CNNs) and recurrent neural networks (RNNs), for tasks such as perception, decision-making, and control. Presently, major companies such as Tesla, Waymo, Uber, and Volkswagen Group (VW) leverage neural networks for advanced perception and autonomous decision-making. However, concerns have been raised about the escalating computational requirements of training these neural models, primarily in terms of energy consumption and environmental impact. In the context of optimisation and sustainability, Spiking Neural Networks (SNNs), inspired by the temporal processing of the human brain, have emerged as a third generation of neural networks, known for their energy efficiency, their potential for handling real-time driving scenarios, and their efficient processing of temporal information. However, SNNs have not yet achieved the performance levels of their predecessors in critical AD tasks, partly due to the intricate dynamics of neurons, their non-differentiable spike operations, and the lack of specialised benchmark workloads and datasets, among others. This paper examines the principles, models, learning rules, and recent advancements of SNNs in the AD domain. Neuromorphic hardware, hand in hand with SNNs, shows potential but has challenges in accessibility, cost, integration, and scalability. This examination aims to bridge gaps by providing a comprehensive understanding of SNNs in the AD field. It emphasises the role of SNNs in shaping the future of AD while considering optimisation and sustainability.

Recommended citation: Fernando S. Martínez, Jordi Casas-Roma, Laia Subirats, Raúl Parada, (2024). Spiking neural networks for autonomous driving: A review. Engineering Applications of Artificial Intelligence, Volume 138, Part B, N. 109415, https://doi.org/10.1016/j.engappai.2024.109415
Download Paper

Graph Neural Networks for Multimodal Brain Connectivity Analysis in Multiple Sclerosis

Published in Graph-Based Representations in Pattern Recognition, 2025

Accurately predicting subject status from brain network data is a complex task that requires advanced machine learning techniques. In this work, we propose a comprehensive methodology and pipeline for applying supervised graph learning models, specifically Graph Neural Networks, to this task using brain network information derived from diffusion tensor imaging, gray matter and resting-state functional MRI adjacency matrices. Our approach includes a graph pruning step to retain the most relevant edges while preserving crucial information, the generation of node features to enhance graph representations, the creation of synthetic data to balance the dataset and improve training, and the design and training of GNN models for both multi-class and binary classification tasks. Experimental results in a cohort of people with multiple sclerosis and healthy volunteers demonstrate that our methodology effectively captures meaningful patterns in brain graphs, leading to improved classification performance.

Recommended citation: Subirà-Cribillers, M., Solé-Casaramona, J., Lladós, J., Casas-Roma, J. (2025). Graph Neural Networks for Multimodal Brain Connectivity Analysis in Multiple Sclerosis. In: Brun, L., Carletti, V., Bougleux, S., Gaüzère, B. (eds) Graph-Based Representations in Pattern Recognition. GbRPR 2025. Lecture Notes in Computer Science, vol 15727. Springer, Cham. https://doi.org/10.1007/978-3-031-94139-9_9
Download Paper

Unsupervised Cluster Analysis Reveals Distinct Subtypes of ME/CFS Patients Based on Peak Oxygen Consumption and SF-36 Scores

Published in Clinical Therapeutics, 2023

Myalgic encephalomyelitis, commonly referred to as chronic fatigue syndrome (ME/CFS), is a severe, disabling chronic disease, and an objective assessment of prognosis is crucial to evaluate the efficacy of future drugs. Attempts are ongoing to find a biomarker to objectively assess the health status of ME/CFS patients. This study therefore aims to demonstrate that oxygen consumption is a biomarker of ME/CFS and provides a method to classify patients diagnosed with ME/CFS based on their responses to the Short Form-36 (SF-36) questionnaire, which can predict oxygen consumption using cardiopulmonary exercise testing (CPET).

Recommended citation: Marcos Lacasa, Patricia Launois, Ferran Prados, José Alegre, Jordi Casas-Roma, (2023). Unsupervised Cluster Analysis Reveals Distinct Subtypes of ME/CFS Patients Based on Peak Oxygen Consumption and SF-36 Scores. Clinical Therapeutics, Volume 45, Issue 12, pp. 1228-1235. https://doi.org/10.1016/j.clinthera.2023.09.007
Download Paper

talks

Privacidad de datos (Data privacy)

Published:

In this post I present a summary and the full presentation of one of the talks from the UOC Data Day, held on 21 June 2017 in Madrid, on data privacy.

Brain Networks and Medical Image Analysis

Published:

In this seminar, the research conducted over the past years will be presented, along with new projects or ideas to be implemented in the future. The objective is to share prior experiences and promote synergies with other researchers and groups within the Computer Vision Center.

teaching

Teaching Book Chapters

Books, 2023

  • Casas-Roma, J. (2016). Análisis de redes (book chapter in Spanish). Book: Fundamentos de Data Science. Barcelona, Spain: Oberta UOC Publishing, SL.
  • Casas-Roma, J. (2016). Open Data: información de la ciudad en abierto (book chapter in Spanish). Book: Tecnologías de la Smart City. Barcelona, Spain: Oberta UOC Publishing, SL.
  • Borrell Viader, J.; Casas Roma, J.; Garrigues, C.; Pérez Solà, C.; Perramon Tornil, X.; Rifà, H; Robles Martínez, S. (2016). Seguridad y privacidad en las smart cities (book chapter in Spanish). Book: Tecnologías de la Smart City. Barcelona, Spain: Oberta UOC Publishing, SL.
  • Casas-Roma, J.; Melià Seguí, J. (2016). Big data, redes sociales e inteligencia contextual (book chapter in Spanish). Book: Herramientas para gestión smart de la ciudad. Barcelona, Spain: Oberta UOC Publishing, SL.
  • Casas-Roma, J. (2015). Almacenamiento y explotación de datos en el comercio electrónico (book chapter in Spanish). Book: Tecnologías del comercio electrónico. Barcelona, Spain: Oberta UOC Publishing, SL.
  • Casas-Roma, J. (2012). Introducción al diseño de bases de datos (book chapter in Spanish). Book: Diseño de Bases de Datos. Barcelona, Spain: Eureca Media, SL.
  • Casas-Roma, J. (2012). Diseño conceptual de bases de datos (book chapter in Spanish). Book: Diseño de Bases de Datos. Barcelona, Spain: Eureca Media, SL.
  • Casas-Roma, J. (2012). Diseño físico de bases de datos (book chapter in Spanish). Book: Diseño de Bases de Datos. Barcelona, Spain: Eureca Media, SL.
  • Casas-Roma, J. (2004). Modelos de diseño de las TIC (book chapter in Spanish). Book: Máster Internacional en e-Learning. Barcelona, Spain: Eureca Media, SL.

Teaching Books

Books, 2024

  • Bosch Rué, A.; Casas Roma, J.; Lozano Bagén, T. (2020). Deep learning: Principios y fundamentos (book in Spanish. Available at https://hdl.handle.net/10609/153482). Barcelona, Spain: Editorial UOC. 260 pp. ISBN: 9788491806561.
  • Casas Roma, J.; Nin Guerrero, J.; Julbe López, F. (2019). Big data: Análisis de datos en entornos masivos (book in Spanish. Available at https://hdl.handle.net/10609/153458). Barcelona, Spain: Editorial UOC. 287 pp. ISBN: 9788491804727.
  • Gironés Roig, J.; Casas-Roma, J.; Minguillón Alfonso, J.; Caihuelas Quiles, R. (2017). Minería de datos: Modelos y algoritmos (book in Spanish. Available at https://hdl.handle.net/10609/153501). Barcelona, Spain: Editorial UOC. 274 pp. ISBN: 9788491169031.
  • Casas-Roma, J.; Romero Tris, C. (2017). Privacidad y anonimización de datos (book in Spanish. Available here). Barcelona, Spain: Editorial UOC. 150 pp. ISBN: 9788491169383.
  • Pérez-Solà, C.; Casas-Roma, J. (2016). Análisis de datos de redes sociales (book in Spanish. Available at https://hdl.handle.net/10609/153439). Barcelona, Spain: Editorial UOC. 182 pp. ISBN: 9788491163664.
  • Casas-Roma, J.; Conesa Caralt, J. (2013). Diseño conceptual de bases de datos en UML (book in Spanish. Available here). Barcelona, Spain: Editorial UOC. 152 pp. ISBN: 9788490297698.

Teaching Activities

Teaching courses, Universitat Autònoma de Barcelona (UAB), Department of Computer Science, 2025

Associate professor at Faculty of Computer Science, Multimedia and Telecommunications at Universitat Oberta de Catalunya (UOC).