Supplementary Material - pilot data processing and visualization
PFAS exposure of humans, animals and the environment: protocol of an evidence review map and bibliometric analysis
In this Rmarkdown document we provide preliminary workflow and code for:
Objective 1. Mapping: What evidence on PFAS has been synthesized? We aim to reveal what areas in the realm of PFAS health-related and environmental research have been synthesized the most and where syntheses of evidence are lacking.
Objective 2. Critical appraisal: How robust are syntheses of PFAS evidence? We will examine the included syntheses for their reporting quality and potential biases, in order to assess reliability of reviews’ conclusions, reveal syntheses that should be re-done, and highlight the aspects of review methodology that need to be improved.
Objective 3. Bibliometrics: How is synthesized PFAS evidence connected? We will examine which countries and institutions are mostly involved in secondary PFAS research and what do the networks between these institutions look like.
Cross-Objective Insights: We will investigate how review type, indicators of review quality or transparency are related other review properties, such as publication date, main review topic and subject, etc.
Note: The code and all created outputs are based on a pilot set of data extracted from only 8 papers that fulfilled our inclusion criteria during the piloting stage. The code will need to be adjusted and expanded for the final data set (e.g., additional data cleaning steps or adjustments to the plot styling and contents). These changes will be documented via tracking on GitHub.
R code processing messages and warnings are suppressed from the output document.
Setup
Set global code chunk parameters and load packages.
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.4 ✓ purrr 0.3.4
## ✓ tibble 3.1.2 ✓ dplyr 1.0.5
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(here)
## here() starts at /Users/itchyshin/Dropbox/Local_git_repos/UmbrellaReview_PFAS_pilot
library(stringr)
library(knitr)
library(formatR)
library(forcats)
library(ggplot2)
library(hrbrthemes) #for ggplot2
library(patchwork) #for ggplot2
library(bibliometrix)
## To cite bibliometrix in publications, please use:
##
## Aria, M. & Cuccurullo, C. (2017) bibliometrix: An R-tool for comprehensive science mapping analysis,
## Journal of Informetrics, 11(4), pp 959-975, Elsevier.
##
##
## https://www.bibliometrix.org
##
##
## For information and bug reports:
## - Send an email to info@bibliometrix.org
## - Write a post on https://github.com/massimoaria/bibliometrix/issues
##
## Help us to keep Bibliometrix free to download and use by contributing with a small donation to support our research team (https://bibliometrix.org/donate.html)
##
##
## To start with the shiny web-interface, please digit:
## biblioshiny()
library(igraph)
##
## Attaching package: 'igraph'
## The following objects are masked from 'package:dplyr':
##
## as_data_frame, groups, union
## The following objects are masked from 'package:purrr':
##
## compose, simplify
## The following object is masked from 'package:tidyr':
##
## crossing
## The following object is masked from 'package:tibble':
##
## as_data_frame
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
Load pilot data
Manually extracted pilot data is stored in four separate .csv files representing different aspects of the data (extracted via structured predefined Google Forms - one per table). Additional fixed table stores pre-extracted data on PFAS types included in this map.
Bibliographic data records are exported from Scopus (including cited references field) in .bib format and locally saved as scopus.bib. the pilot version of this file only includes bibliographic information of ‘systeamtic review family’-like reviews included in Objectives 1 and 2. In the actual work bibliographic information will be expanded to all reviews on PFAS exposure in humans, animals and the environment.
The study identity (authors, title) has been obscured in the pilot data extraction file. It will be public in the final data extraction files.
# Data from .csv file - main data sheet
mdata <- read_csv(here("data","ReviewMap_PFAS_main_pilot.csv"), skip = 0)
#dim(mdata) #8 rows 34 columns
# Data from .csv file - Species_info sheet
spdata <- read_csv(here("data","ReviewMap_PFAS_species_pilot.csv"), skip = 0)
#dim(spdata) #2 rows 4 columns
# change to long format (one species per row - one or multple rows per study):
spdata %>% select(-X4) %>% separate_rows(Species_scientific_name, sep='; ') -> spdata
#dim(spdata) #29 rows 3 columns
# Data from .csv file - PFAS_types sheet:
ptdata <- read_csv(here("data","ReviewMap_PFAS_types_pilot.csv"), skip = 0)
#dim(ptdata) #8 rows 4 columns
# change to long format (one type per row - one or multple rows per study):
ptdata %>% select(-X4) %>% separate_rows(PFAS_type, sep=', ') -> ptdata
#dim(ptdata) #70 rows 3 columns
# Data from .csv file - PFAS_info sheet:
pidata <- read_csv(here("data","PFAS_info.csv"), skip = 0)
#dim(pidata) #35 rows 7 columns
# Critical apprisal(AMSTAR2) from .csv file:
qdata <- read_csv(here("data", "ReviewMap_PFAS_quality_pilot.csv"), skip = 0)
#dim(qdata) #11 rows 38 columns
qdata %>% select(-X34) -> qdata #remove empty column
# Bibliometric data of included papers (with references) dwonloaded from Scopus in .bib format:
bib_sco <- convert2df(here("data","scopus.bib"), dbsource = "scopus", format = "bibtex")
##
## Converting your scopus collection into a bibliographic dataframe
##
## Done!
##
##
## Generating affiliation field tag AU_UN from C1: Done!
#dim(bib_sco) #8 rows 33 columns
Merge data stored across tables
The data stored across multiple flat spreadsheets needs merged for some of the analyses.
## Prepare - remove columns starting with "Checked"
# (info on data checking - these fields will be present in the final extraction sheet),
# also remove any other redundand columns:
# mdata %>% select(-starts_with("Checked")) -> mdata
# spdata %>% select(-starts_with("Checked")) -> spdata
# ptdata %>% select(-starts_with("Checked")) -> ptdata
# pidata %>% select(-starts_with("Checked")) -> pidata
## Merge main data with species info
mspdata <- left_join(mdata, spdata, by = "Study_ID")
#dim(mspdata) #8 rows 37 columns
#names(mspdata)
#str(mspdata)
#View(mspdata)
# Merge PFAS types data with PFAS info and then with main data
ptidata <- left_join(ptdata, pidata, by = "PFAS_type")
#dim(ptidata) #8 rows 10 columns
#names(ptidata)
#str(ptidata)
#View(ptidata)
mpidata <- left_join(mdata, ptidata, by = "Study_ID")
#dim(mpidata) #8 rows 43 columns
#names(mpidata)
#str(mpidata)
#View(mpidata)
Make a simple summary table of included papers
Note: authors names and paper titles are blinded for the protocol.
knitr::kable(select(mdata, Study_ID, Author_year, Paper_title), caption = "Table of included studies")
Table of included studies
S_001 |
Authors1_2017 |
Title1 |
S_002 |
Authors2_2013 |
Title2 |
S_003 |
Authors3_2019 |
Title3 |
S_004 |
Authors4_2021 |
Title4 |
S_005 |
Authors5_2017 |
Title5 |
S_006 |
Authors6_2017 |
Title6 |
S_007 |
Authors7_2016 |
Title7 |
S_008 |
Authors8_2020 |
Title8 |
Objective 1. Mapping: What evidence on PFAS has been synthesized?
Here we aim to reveal what areas in the realm of PFAS health-related and environmental research have been synthesized the most and where syntheses of evidence are lacking. Below, we include examples of potential data summaries and visualizations that will form the evidence review map. The pilot dataset presented here consists of 8 review studies.
Focus on PFAS, or POPs, and how many PFAS
#str(mdata)
mdata %>% count(PFAS_focus, PFAS_one_many) %>% arrange(desc(n)) %>%
filter(PFAS_focus != "NA") -> t_PFASfocus #filter out NA
ggplot(t_PFASfocus, aes(x = PFAS_focus, y = n)) +
theme_light() +
labs(title = expression("PFAS focus")) + #~bold(A.)~' Type and subject'
coord_flip() +
scale_fill_brewer() +
guides(fill=guide_legend(title="PFAS:")) +
geom_col(aes(fill = PFAS_one_many), width = 0.7) +
scale_y_continuous(name = "Article count") +
theme_light() +
theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black"),
legend.title = element_text(size=10), legend.text=element_text(size = 10),
axis.title.x = element_text(size = 10), axis.title.y = element_blank())
#ggsave(here("plots","figure_PFAS_focus.pdf"), width = 4, height = 6, units = "cm", scale = 2, device = cairo_pdf)
PFAS types
#str(ptidata)
ptidata %>% count(PFAS_type) %>% arrange(desc(n)) %>% filter(PFAS_type != "NA") -> t_PFAStype #filter out NA
t_PFAStype$PFAS_type <- factor(t_PFAStype$PFAS_type, levels = t_PFAStype$PFAS_type[order(t_PFAStype$n, decreasing = FALSE)])
PFAS_levels_order <- t_PFAStype$PFAS_type[order(t_PFAStype$n, decreasing = FALSE)] #save for another plot
# simple barplot with PFAS types
ggplot(t_PFAStype, aes(x = PFAS_type, y = n)) +
geom_col(aes(fill = ""), width = 0.7) +
theme_light() +
labs(title = expression("PFAS types reviewed")) + #~bold(A.)~' Type and subject'
coord_flip()+
scale_y_continuous(name = "Article count") +
scale_fill_manual(values = c("#919191")) +
theme(legend.position = "none", axis.title.x = element_text(size = 10), axis.title.y = element_blank())
#ggsave(here("plots","figure_PFAS_types.pdf"), width = 4, height = 6, units = "cm", scale = 2, device = cairo_pdf)
Main review topics (MeSH terms)
Each records usually has multiple associated MeSH terms, which are separated with a semicolon (or a semicolon with a newline symbol).
The main topics of the article are denoted by the MeSH terms with asterisks *, but some records will not have main topics, so its safer to ignore this information.
Some MeSH headings also have qualifiers (sometimes called subheadings) - these are shown after a forward-slash and are used to “describe the specific aspects of the MeSH heading that are pertinent to the article” (https://www.nlm.nih.gov/bsd/indexing/training/SUB_010.html). MeSH terms and qualifiers could be analysed separately.
# change to long format (one species per row - one or multple rows per study):
mdata %>% select(MeSH_terms) %>% separate_rows(MeSH_terms, sep = ";\\\n") %>% separate_rows(MeSH_terms, sep = "; ") -> t_MESH
#bring to lower case and remove ending and trailing whitespace from character strings, single character vector:
tmesh <- str_to_lower(trimws(t_MESH$MeSH_terms))
#remove star character:
tmesh <- sub("[\\*]", "", tmesh)
#separate terms before and after "/" into separate columns (headings and quantifiers, respectively)
tmesh2 <- str_split_fixed(tmesh, " / ", n = 2)
#create a MeSH dataframe with three columns: terms, headings, qualifiers
t_MESH2 <- bind_cols(tmesh, tmesh2[,1], tmesh2[,2])
colnames(t_MESH2) <- c("MeSH_terms", "MeSH_headings", "MeSH_qualifiers")
#count frequencies of headings
t_MESH2 %>% select(MeSH_headings) %>% count(MeSH_headings) %>% arrange(desc(n)) %>%
filter(MeSH_headings != "NA") -> MeSH_headings_counts
#str(MeSH_headings_counts)
MeSH_headings_counts$MeSH_headings <- factor(MeSH_headings_counts$MeSH_headings, levels = MeSH_headings_counts$MeSH_headings[order(MeSH_headings_counts$n, decreasing = FALSE)])
MESH_headings_order <- MeSH_headings_counts$MeSH_headings[order(MeSH_headings_counts$n, decreasing = FALSE)] #save for the next plot
# simple barplot with top 10 MESH headings
ggplot(MeSH_headings_counts[1:10, ], aes(x = MeSH_headings, y = n)) +
geom_col(aes(fill = ""), width = 0.7) +
theme_light() +
labs(title = expression("Top 10 MeSH headings reviewed")) + #~bold(A.)~' Type and subject'
coord_flip() +
scale_y_continuous(name = "Article count") +
scale_fill_manual(values = c("#919191")) +
theme(legend.position = "none", axis.title.x = element_text(size = 10), axis.title.y = element_blank())
#ggsave(here("plots","figure_MeSH_headings.pdf"), width = 4, height = 6, units = "cm", scale = 2, device = cairo_pdf)
#count frequencies of qualifiers
t_MESH2 %>% select(MeSH_qualifiers) %>% count(MeSH_qualifiers) %>% arrange(desc(n)) %>% filter(MeSH_qualifiers != "NA") %>% filter(MeSH_qualifiers != "")-> MeSH_qualifiers_counts
#str(MeSH_qualifiers_counts)
MeSH_qualifiers_counts$MeSH_qualifiers <- factor(MeSH_qualifiers_counts$MeSH_qualifiers, levels = MeSH_qualifiers_counts$MeSH_qualifiers[order(MeSH_qualifiers_counts$n, decreasing = FALSE)])
MESH_qualifiers_order <- MeSH_qualifiers_counts$MeSH_qualifiers[order(MeSH_qualifiers_counts$n, decreasing = FALSE)] #save for the next plot
# simple barplot with MESH qualifiers
ggplot(MeSH_qualifiers_counts[, ], aes(x = MeSH_qualifiers, y = n)) +
geom_col(aes(fill = ""), width = 0.7) +
theme_light() +
labs(title = expression("MeSH qualifiers reviewed")) + #~bold(A.)~' Type and subject'
coord_flip() +
scale_y_continuous(name = "Article count") +
scale_fill_manual(values = c("#919191")) +
theme(legend.position = "none", axis.title.x = element_text(size = 10), axis.title.y = element_blank())
#ggsave(here("plots","figure_MeSH_qualifiers.pdf"), width = 4, height = 6, units = "cm", scale = 2, device = cairo_pdf)
Main review subjects
#str(mdata)
mdata %>% count(Human_animal_environment) %>% arrange(desc(n)) %>%
filter(Human_animal_environment != "NA") %>% filter(Human_animal_environment != "") -> t_subject #filter out NA
t_subject$Human_animal_environment <- factor(t_subject$Human_animal_environment, levels = unique(t_subject$Human_animal_environment[order(t_subject$n, decreasing = FALSE)]))
subjects_levels_order <- t_subject$Human_animal_environment[order(t_subject$n, decreasing = FALSE)] #save for another plot
# simple barplot with MESH_heading
ggplot(t_subject, aes(x = Human_animal_environment, y = n)) +
geom_col(aes(fill = ""), width = 0.7) +
theme_light() +
labs(title = expression("Main subjects reviewed")) + #~bold(A.)~' Type and subject'
coord_flip()+
scale_y_continuous(name = "Article count") +
scale_fill_manual(values = c("#919191")) +
theme(legend.position = "none", axis.title.x = element_text(size = 10), axis.title.y = element_blank())
#ggsave(here("plots","figure_subjects.pdf"), width = 4, height = 2, units = "cm", scale = 2, device = cairo_pdf)
Combining MeSH qualifiers with main review subjects
#str(mdata)
#subset main dat by main subject
#table(mdata$Human_animal_environment)
# change to long format (one species per row - one or multple rows per study):
mdata %>% group_by(Human_animal_environment) %>% select(MeSH_terms) %>%
separate_rows(MeSH_terms, sep = ";\\\n") %>% separate_rows(MeSH_terms, sep = "; ") -> t_MESHby
#str(t_MESHby)
tmeshby <- str_to_lower(trimws(t_MESHby$MeSH_terms)) #bring to lower case and remove ending and trailing whitespace from character strings, single character vector
tmeshby <- sub("[\\*]", "", tmeshby) #remove star character
#separate terms before and after "/" into separate columns (headings and quantifiers, respectively)
tmeshby2 <- str_split_fixed(tmeshby, " / ", n = 2)
#create a MeSH dataframe with three columns: terms, headings, qualifiers
t_MESHby2 <- bind_cols(t_MESHby$Human_animal_environment, tmeshby, tmeshby2[,1], tmeshby2[,2])
colnames(t_MESHby2) <- c("Subject", "MeSH_terms", "MeSH_headings", "MeSH_qualifiers")
#str(t_MESHby2)
t_MESHby2 %>% count(MeSH_qualifiers, Subject) %>% arrange(desc(n)) %>% filter(MeSH_qualifiers != "NA") %>%
filter(MeSH_qualifiers != "") -> t_topic_subject #filter out NA
t_topic_subject$Subject <- factor(t_topic_subject$Subject, levels = subjects_levels_order) #use order from the previous graph
t_topic_subject$MeSH_qualifiers <- factor(t_topic_subject$MeSH_qualifiers, levels = MESH_qualifiers_order) #use order from the previous graph
#str(t_topic_subject)
ggplot(t_topic_subject, aes(x = MeSH_qualifiers, y = n)) +
theme_light() +
labs(title = expression("MeSH qualifiers and subjects reviewed")) + #~bold(A.)~' Type and subject'
coord_flip() +
scale_fill_brewer() +
geom_col(aes(fill = Subject), width = 0.7) +
scale_y_continuous(name = "Article count") +
theme_light() +
theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black"), legend.title = element_blank(), legend.text=element_text(size = 10), axis.title.x = element_text(size = 10), axis.title.y = element_blank())
#ggsave(here("plots","figure_MeSH_qualifiers_subjects.pdf"), width = 4, height = 6, units = "cm", scale = 2, device = cairo_pdf)
PFAS focus by main review subject
# PFAS_one_many vs. Human_animal_environment contingency table
mdata %>% count(Human_animal_environment, PFAS_one_many) -> t1
ggplot(t1, aes(x = PFAS_one_many , y = n)) +
coord_flip() +
geom_col(aes(fill = Human_animal_environment), width = 0.7) +
theme_light() +
theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black")) +
ylab("Article count") +
xlab("PFAS") +
guides(fill=guide_legend(title="Subject:"))
#ggsave(here("plots","figure_PFAS_focus_subjects.pdf"), width = 4, height = 2, units = "cm", scale = 2, device = cairo_pdf)
PFAS types by main review subject
#str(mpidata)
mpidata %>% count(PFAS_type, Human_animal_environment) %>% arrange(desc(n)) %>% filter(PFAS_type != "NA") -> t_PFAStype #filter out NA
#t_PFAStype$PFAS_type <- factor(t_PFAStype$PFAS_type, levels = unique(t_PFAStype$PFAS_type[order(t_PFAStype$n, decreasing = FALSE)]))
t_PFAStype$PFAS_type <- factor(t_PFAStype$PFAS_type, levels = PFAS_levels_order) #use order from the previous graph
p_PFAStype <- ggplot(t_PFAStype, aes(x = PFAS_type, y = n)) +
theme_light() +
labs(title = expression("PFAS type and main subjects reviewed")) + #~bold(A.)~' Type and subject'
coord_flip() +
scale_fill_brewer() +
geom_col(aes(fill = Human_animal_environment), width = 0.7) +
scale_y_continuous(name = "Article count") +
theme_light() +
theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black"), legend.title = element_blank(), legend.text=element_text(size = 10), axis.title.x = element_text(size = 10), axis.title.y = element_blank()) +
theme(legend.position = c(0.8, 0.5), legend.background = element_rect(fill = "white", color = "black")) #lace legend inside plot area
#theme(legend.position = "none", axis.title.x = element_text(size = 10), axis.title.y = element_blank())
p_PFAStype
#ggsave(here("plots","figure_PFAStype_subject.pdf"), width = 6, height = 8, units = "cm", scale = 2, device = cairo_pdf)
How many species and which ones
#str(mspdata)
mspdata %>% count(Species_scientific_name, Species_one_many) %>% arrange(desc(n)) %>% filter(Species_one_many != "NA") -> t_spfocus #filter out NA
ggplot(t_spfocus, aes(x = Species_scientific_name, y = n)) +
theme_light() +
labs(title = expression("Species focus")) + #~bold(A.)~' Type and subject'
coord_flip() +
scale_fill_brewer() +
guides(fill=guide_legend(title="Species:")) +
geom_col(aes(fill = Species_one_many), width = 0.7) +
scale_y_continuous(name = "Article count") +
theme_light() +
theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black"), legend.title = element_text(size=10), legend.text=element_text(size = 10), axis.title.x = element_text(size = 10), axis.title.y = element_blank()) #+
#ggsave(here("plots","figure_species_focus.pdf"), width = 4, height = 6, units = "cm", scale = 2, device = cairo_pdf)
mdata %>% count(Species_one_many, PFAS_one_many) -> t2 #make table of counts across two columns
t2$Species_one_many <- as.factor(t2$Species_one_many)
levels(t2$Species_one_many) <- c("Many", "One") #reorder levels for Species_one_many
t2 %>% filter(Species_one_many != "NA") -> t2 #filter out NA
# simple dot plot
ggplot(data = t2, aes(Species_one_many, PFAS_one_many)) + geom_point(aes(size = n), colour = "darkgrey") +
theme_classic() +
theme(axis.ticks.x=element_blank(), axis.text.x=element_text(size = 20, angle = 90, hjust = 0)) +
theme(axis.ticks.y=element_blank(), axis.text.y=element_text(size = 20)) +
theme(axis.line=element_blank()) +
theme(text = element_text(size = 22)) +
theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black")) +
theme(plot.margin = unit(c(10,10,10,10), "mm")) +
scale_size_continuous(range = c(1, 10)) +
#theme(axis.title = element_blank()) +
scale_x_discrete(position = "top") +
xlab("Species") +
ylab("PFAS")
#+ labs(title = "Counts of reviews by subject and type")
#ggsave(here("plots","figure_dotplot_PFAS_Species.pdf"), width = 4, height = 4, units = "cm", scale = 4, device = cairo_pdf)
Time-trends
mdata %>% count(Publication_year) %>% ggplot(aes(x = Publication_year, y = n)) +
geom_area(fill = '#919191', alpha = 1) +
geom_line(color = 'black', size = 1) +
geom_point(size=1, color = 'black') +
theme_minimal() +
scale_x_continuous(name = "", limits = c(2005, 2020)) +
scale_y_continuous(name = "Article count", limits = c(0, 5)) +
ggtitle("Publication year") +
theme(plot.title = element_text(hjust = 0.5))
#ggsave(here("plots","figure_area_publ_year.pdf"), width = 4, height = 3, units = "cm", scale = 2, device = cairo_pdf)
#modified graph with breakdown by Review_type_claimed (needs more data points to be displayed properly)
#mspdata %>% count(Publication_year, Review_type_claimed) %>% ggplot(aes(x = Publication_year, y = n, fill = Review_type_claimed, text = Review_type_claimed)) + geom_area()
#modified graph with breakdown by Species_higher_taxon (needs more data points to be displayed properly)
#mspdata %>% count(Publication_year, Species_higher_taxon) %>% ggplot(aes(x = Publication_year, y = n, fill = Species_higher_taxon, text = Species_higher_taxon)) + geom_area()
Objective 2. Critical appraisal: How robust are syntheses of PFAS evidence?
Here we will examine the included syntheses for their reporting quality and potential biases, in order to assess reliability of reviews’ conclusions, reveal syntheses that should be re-done, and highlight the aspects of review methodology that need to be improved.
Make a table with AMSTAR2 questions
The questions were stored separately in the first row of the file, and were loaded separately. We display them in a table, alongside the short labels that can be used in the plots. Full description of the questions and coding rules (from an adapted AMSTAR2 checklist; Shea et al. 2017; Samarasinghe et al. 2019), are provided in the protocol.
questions_list <- qdata %>% select(starts_with("Q") & !ends_with("_comment")) %>% colnames() #only select columns with assessment codes (data)
questions <- tibble(questions = questions_list, label = c("question and criteria", "a priori protocol", "included study designs", "comprehensive search", "selection duplicated", "extraction duplicated", "list of excluded studies", "summary of included studies", "critical appraisal", "sources of funding", "quantitative synthesis", "risk of bias", "effect of bias", "variability investigated", "publication bias", "conflict of interest"))
#str(questions)
knitr::kable(questions, caption = "Table. List of AMSTAR2 questions with labels for plotting")
Table. List of AMSTAR2 questions with labels for plotting
Q1. Are the research questions and inclusion criteria for the review clearly delineated? |
question and criteria |
Q2. Did the report of the review contain an explicit statement that the review methods were established prior to the conduct of the review and did the report justify any significant deviations from the protocol? |
a priori protocol |
Q3. Did the review authors explain their selection of the study designs for inclusion in the review? |
included study designs |
Q4. Did the review authors use a comprehensive literature search strategy? |
comprehensive search |
Q5. Did the review authors perform study selection in duplicate? |
selection duplicated |
Q6. Did the review authors perform data extraction in duplicate? |
extraction duplicated |
Q7. Did the review authors provide a list of excluded studies and justify the exclusions? |
list of excluded studies |
Q8. Did the review authors describe the included studies in adequate detail? |
summary of included studies |
Q9. Did the review authors use a satisfactory technique for assessing the risk of bias (RoB) in individual studies that were included in the review? |
critical appraisal |
Q10. Did the review authors report on the sources of funding for the studies included in the review? |
sources of funding |
Q11. If meta-analysis was performed did the review authors use appropriate methods for statistical combination of results? |
quantitative synthesis |
Q12. If meta-analysis was performed, did the review authors assess the potential impact of RoB in individual studies on the results of the meta-analysis or other evidence synthesis? |
risk of bias |
Q13. Did the review authors account for RoB in individual studies when interpreting/ discussing the results of the review? |
effect of bias |
Q14. Did the review authors provide a satisfactory explanation for, and discussion of, any heterogeneity observed in the results of the review? |
variability investigated |
Q15. If they performed quantitative synthesis did the review authors carry out an adequate investigation of publication bias (small study bias) and discuss its likely impact on the results of the review? |
publication bias |
Q16. Did the review authors report any potential sources of conflict of interest, including any funding they received for conducting the review? |
conflict of interest |
Summary plot for pilot AMSTAR2 data
AMSTAR2 assessment results as percentages of each answer score per question.
## prepare data
dim(qdata) #some empty rows to be removed
## [1] 11 33
#studies <- qdata$Author_year #normally would use this
#only select columns with assessment codes (drop comments and empty rows)
qtable <- qdata %>% filter(Study_ID!="NA") %>% select(starts_with("Q") & !ends_with("_comment"))
#simiplify column names to Q+number format
names(qtable) <- gsub("\\..*", "", names(qtable))
#simplify all the answers to short strings (version for a single column: gsub(" =.*", "", qtable$Q1)):
qtable <- apply(qtable, 2, function(y) as.character(gsub(" =.*", "", y)))
#save studies in a new vector
studies <- qdata$Study_ID[!is.na(qdata$Study_ID)]
#convert to long format data frame with Study_ID
qtable_long <- data.frame(study = rep(as.factor(studies), each = length(studies)),
question = rep(colnames(qtable), each = length(studies)),
score = as.vector(qtable), stringsAsFactors = TRUE) #make long format table
rownames(qtable_long) <- NULL
qtable_long$question <- factor(qtable_long$question, levels(qtable_long$question)[rev(c(1,9:16,2:8))]) #setting the order of levels - by Q-number
#add a column with verbal expression of scores:
qtable_long$score_word <- qtable_long$score
levels(qtable_long$score_word) <- c("No", "Partially", "Yes", "NotApplicable")
summaryplot <- ggplot(data = qtable_long) +
geom_bar(mapping = aes(x = question, fill = score_word), width = 0.7,
position = "fill", color = "black") +
coord_flip(ylim = c(0, 1)) +
guides(fill = guide_legend(reverse = TRUE)) +
scale_fill_manual("Risk of Bias",
labels = c(" Partially ",
" No ",
" Yes ",
" Not Applicable "),
values = c(Partially = "#E2DF07", #yellow
No = "#BF0000", #red
Yes = "#02C100", #green
NotApplicable = "grey")) +
scale_y_continuous(labels = scales::percent) +
theme(axis.title.x = element_blank(),
axis.title.y = element_blank(),
axis.ticks.y = element_blank(),
axis.text.x = element_text(size = 14, color = "black", hjust=0.5),
axis.text.y = element_text(size = 14, color = "black", hjust=0),
axis.line.x = element_line(colour = "black", size = 0.5, linetype = "solid"),
legend.position = "top",
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
legend.background = element_rect(linetype = "solid", colour = "grey"),
legend.title = element_blank(),
legend.key.size = unit(0.75, "cm"),
legend.text = element_text(size = 14))
## display plot
summaryplot
## save plot
#ggsave(here("plots","figure_AMSTAR2_summary_v01.pdf"), width = 10, height = 8, units = "cm", scale = 2, device = cairo_pdf)
Individual scores plot for pilot AMSTAR2 data
AMSTAR2 assessment results as a score per article per question.
## data prep
dim(qdata) #some empty rows to be removed
## [1] 11 33
#save studies in a new vector
studies <- qdata$Study_ID[!is.na(qdata$Study_ID)]
#studies <- qdata$Author_year #normally would use this
#only select columns with assessment codes (drop comments and empty rows)
qtable <- qdata %>% filter(Study_ID!="NA") %>% select(starts_with("Q") & !ends_with("_comment"))
#simiplify column names to Q+number format
names(qtable) <- gsub("\\..*", "", names(qtable))
#simplify all the answers to short strings (version for a single column: gsub(" =.*", "", qtable$Q1)):
qtable <- apply(qtable, 2, function(y) as.character(gsub(" =.*", "", y)))
## prepare data - change scores to verbal expressions:
qtable <- gsub("N/A", "NotApplicable", qtable)
qtable <- gsub("0.5", "Partially", qtable)
qtable <- gsub("1", "Yes", qtable)
qtable <- gsub("0", "No", qtable)
#convert to long format data frame with Study_ID
qtable_long <- data.frame(study = as.factor(studies),
question = rep(colnames(qtable), each = length(studies)),
measurement = as.vector(qtable), stringsAsFactors=FALSE)
rownames(qtable_long) = NULL
qtable_long$question <- as.factor(qtable_long$question)
qtable_long$question <- factor(qtable_long$question, levels(qtable_long$question)[c(1,9:16,2:8)]) #setting the order of levels - by Q-number
qtable_long$study <- factor(qtable_long$study, levels = unique(studies)[rev(order(unique(qtable_long$study)))]) #Re-order by study name (alphabetically)
##Make scoreplot
scoresplot <- ggplot(data = qtable_long, aes(y = study, x = question)) +
geom_tile(color="black", fill="white", size = 0.8) +
geom_point(aes(color=as.factor(measurement)), size=10) +
#geom_text(aes(label = measurement), size = 8) +
scale_x_discrete(position = "top") +
scale_color_manual(values = c("Partially" = "#E2DF07",
"No" = "#BF0000",
"Yes" = "#02C100",
"NotApplicable" = "grey")) +
theme_minimal() +
coord_equal() +
theme(axis.title.x = element_text(size = 14, color = "black", face = "italic"),
axis.title.y = element_blank(),
axis.ticks.y = element_blank(),
axis.text.y = element_text(size = 15, color = "black"),
axis.text.x = element_text(size = 14, color = "black", angle = 45, hjust=0),
legend.position = "bottom",
legend.background = element_rect(linetype = "solid", colour = "grey"),
legend.title = element_blank(),
legend.key.size = unit(0.75, "cm"),
legend.text = element_text(size = 12),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
plot.margin = unit(c(1,1,1,0), "cm") #set margins for saving into pdf file
)
scoresplot
## save plot
#ggsave(here("plots","figure_AMSTAR2_scores_v01.pdf"), width = 16, height = 12, units = "cm", scale = 2, device = cairo_pdf)
COI statement
Whether Conflict Of Interests statement is provided for individual reviews.
mdata %>% count(COI_statement) %>% arrange(desc(n)) %>% filter(COI_statement != "NA") -> t_COI_statement #filter out NA
ggplot(t_COI_statement, aes(x = COI_statement, y = n)) +
theme_light() +
theme(legend.position = "none", axis.title.y = element_blank()) +
labs(title = expression("COI statement availability")) + #~bold(A.)~' Type and subject'
coord_flip() +
geom_col(aes(fill = COI_statement), width = 0.7) +
scale_y_continuous(name = "Article count")
COI present
Whether potential Conflict Of Interests is acknowledged for individual reviews.
mdata %>% count(COI_present) %>% arrange(desc(n)) %>% filter(COI_present != "NA") -> t_COI_present #filter out NA
ggplot(t_COI_present, aes(x = COI_present, y = n)) +
theme_light() +
theme(legend.position = "none", axis.title.y = element_blank()) +
labs(title = expression("Conflict of Interest")) + #~bold(A.)~' Type and subject'
coord_flip() +
geom_col(aes(fill = COI_present), width = 0.7) +
scale_y_continuous(name = "Article count")
Funding statement
Whether funding statement is provided for individual reviews.
mdata %>% count(Funding_statement) %>% arrange(desc(n)) %>% filter(Funding_statement != "NA") -> t_Funding_statement #filter out NA
ggplot(t_Funding_statement, aes(x = Funding_statement, y = n)) +
theme_light() +
theme(legend.position = "none", axis.title.y = element_blank()) +
labs(title = expression("Funding statement availability")) + #~bold(A.)~' Type and subject'
coord_flip() +
geom_col(aes(fill = Funding_statement), width = 0.7) +
scale_y_continuous(name = "Article count")
Non-profit funding acknowledged
Whether funding from non-profit (NGO, government, universities, etc.) is acknowledged for individual reviews.
mdata <- mdata %>% rename(Nonprofit_funding = "Non-profit_funding")
mdata %>% count(Nonprofit_funding) %>% arrange(desc(n)) %>% filter(Nonprofit_funding != "NA") -> t_Nonprofit_funding #filter out NA
ggplot(t_Nonprofit_funding, aes(x = Nonprofit_funding, y = n)) +
theme_light() +
theme(legend.position = "none", axis.title.y = element_blank()) +
labs(title = expression("Non-profit funding")) + #~bold(A.)~' Type and subject'
coord_flip() +
geom_col(aes(fill = Nonprofit_funding), width = 0.7) +
scale_y_continuous(name = "Article count")
Industry funding acknowledged
Whether funding from industry (including government-industry collaborations) is acknowledged for individual reviews.
mdata %>% count(Industry_funding) %>% arrange(desc(n)) %>% filter(Industry_funding != "NA") -> t_Funding_industry #filter out NA
ggplot(t_Funding_industry, aes(x = Industry_funding, y = n)) +
theme_light() +
theme(legend.position = "none", axis.title.y = element_blank()) +
labs(title = expression("Funding from industry")) + #~bold(A.)~' Type and subject'
coord_flip() +
geom_col(aes(fill = Industry_funding), width = 0.7) +
scale_y_continuous(name = "Article count")
Data availability
Whether extracted data is made available for individual reviews.
#str(mdata)
mdata %>% count(Raw_data) %>% arrange(desc(n)) %>% filter(Raw_data != "NA") -> t_Raw_data #filter out NA
ggplot(t_Raw_data, aes(x = Raw_data, y = n)) +
theme_light() +
theme(legend.position = "none", axis.title.y = element_blank()) +
labs(title = expression("Data availability")) + #~bold(A.)~' Type and subject'
coord_flip() +
geom_col(aes(fill = Raw_data), width = 0.7) +
scale_y_continuous(name = "Article count")
#ggsave(here("plots","figure_data_availability.pdf"), width = 4, height = 3, units = "cm", scale = 2, device = cairo_pdf)
Code availability
Whether code for analyses is made available for individual reviews.
mdata %>% count(Analysis_code) %>% arrange(desc(n)) %>% filter(Analysis_code != "NA") -> t_Analysis_code #filter out NA
ggplot(t_Analysis_code, aes(x = Analysis_code, y = n)) +
theme_light() +
theme(legend.position = "none", axis.title.y = element_blank()) +
labs(title = expression("Analysis code availability")) + #~bold(A.)~' Type and subject'
coord_flip() +
geom_col(aes(fill = Analysis_code), width = 0.7) +
scale_y_continuous(name = "Article count")
Objective 3. Bibliometrics: How is synthesized PFAS evidence connected?
Examine which countries and institutions are mostly involved in secondary PFAS research. Visualizing networks between these institutions.
Basic summaries from bibliographic records
results1 <- biblioAnalysis(bib_sco) #run basic standard descriptive analysis of the dataset (data frame)
summary(results1, k = 10, pause = F, width = 130) #produces a sequence of standard summary tables displayed in the console
##
##
## MAIN INFORMATION ABOUT DATA
##
## Timespan 2013 : 2021
## Sources (Journals, Books, etc) 8
## Documents 8
## Average years from publication 3.62
## Average citations per documents 55.88
## Average citations per year per doc 10.25
## References 570
##
## DOCUMENT TYPES
## article 3
## review 5
##
## DOCUMENT CONTENTS
## Keywords Plus (ID) 309
## Author's Keywords (DE) 42
##
## AUTHORS
## Authors 56
## Author Appearances 60
## Authors of single-authored documents 0
## Authors of multi-authored documents 56
##
## AUTHORS COLLABORATION
## Single-authored documents 0
## Documents per Author 0.143
## Authors per Document 7
## Co-Authors per Documents 7.5
## Collaboration Index 7
##
##
## Annual Scientific Production
##
## Year Articles
## 2013 1
## 2016 2
## 2017 2
## 2019 1
## 2020 1
## 2021 1
##
## Annual Percentage Growth Rate 0
##
##
## Most Productive Authors
##
## Authors Articles Authors Articles Fractionalized
## 1 BONDE JPE 2 COFFMAN E 0.333
## 2 FLACHS EM 2 HINES EP 0.333
## 3 HRVIG KK 2 RAPPAZZO KM 0.333
## 4 TOFT G 2 BONDE JPE 0.292
## 5 ANDERSEN T 1 ANDERSEN T 0.250
## 6 BACH CC 1 BORG K 0.250
## 7 BALLESTER F 1 HITCHCOCK DJ 0.250
## 8 BALLESTEROS V 1 VARPE 0.250
## 9 BENFENATI E 1 TOFT G 0.238
## 10 BONDE JP 1 FLACHS EM 0.196
##
##
## Top manuscripts per citations
##
## Paper DOI TC TCperYear NTC
## 1 BONDE JP, 2016, HUM REPROD UPDATE 10.1093/HUMUPD/DMW036 131 21.83 1.394
## 2 RAPPAZZO KM, 2017, INT J ENVIRON RES PUBLIC HEALTH 10.3390/ijerph14070691 120 24.00 1.081
## 3 BALLESTEROS V, 2017, ENVIRON INT 10.1016/j.envint.2016.10.015 102 20.40 0.919
## 4 BACH CC, 2016, CRIT REV TOXICOL 10.1080/10408444.2016.1182117 57 9.50 0.606
## 5 DUCHARME NA, 2013, REPROD TOXICOL 10.1016/j.reprotox.2013.06.070 28 3.11 1.000
## 6 HITCHCOCK DJ, 2019, ENVIRON SCI TECHNOL 10.1021/acs.est.9b01296 8 2.67 1.000
## 7 PETERSEN KU, 2020, J TOXICOL ENVIRON HEALTH PART B CRIT REV 10.1080/10937404.2020.1798315 1 0.50 1.000
## 8 GAO X, 2021, ENVIRON RES 10.1016/j.envres.2021.111632 0 0.00 NaN
## 9 NA NA NA NA NA
## 10 NA NA NA NA NA
##
##
## Corresponding Author's Countries
##
## Country Articles Freq SCP MCP MCP_Ratio
## 1 DENMARK 3 0.375 2 1 0.333
## 2 USA 2 0.250 1 1 0.500
## 3 CHINA 1 0.125 1 0 0.000
## 4 NORWAY 1 0.125 1 0 0.000
## 5 SPAIN 1 0.125 0 1 1.000
##
##
## SCP: Single Country Publications
##
## MCP: Multiple Country Publications
##
##
## Total Citations per Country
##
## Country Total Citations Average Article Citations
## 1 DENMARK 189 63
## 2 USA 148 74
## 3 SPAIN 102 102
## 4 NORWAY 8 8
## 5 CHINA 0 0
##
##
## Most Relevant Sources
##
## Sources Articles
## 1 CRITICAL REVIEWS IN TOXICOLOGY 1
## 2 ENVIRONMENT INTERNATIONAL 1
## 3 ENVIRONMENTAL RESEARCH 1
## 4 ENVIRONMENTAL SCIENCE AND TECHNOLOGY 1
## 5 HUMAN REPRODUCTION UPDATE 1
## 6 INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 1
## 7 JOURNAL OF TOXICOLOGY AND ENVIRONMENTAL HEALTH - PART B: CRITICAL REVIEWS 1
## 8 REPRODUCTIVE TOXICOLOGY 1
##
##
## Most Relevant Keywords
##
## Author Keywords (DE) Articles Keywords-Plus (ID) Articles
## 1 PERFLUOROOCTANE SULFONATE (PFOS) 2 MALE 11
## 2 SEMEN QUALITY 2 ENVIRONMENTAL EXPOSURE 10
## 3 ADVERSE BIRTH OUTCOME 1 FEMALE 10
## 4 ADVERSE PREGNANCY OUTCOME 1 PERFLUOROOCTANESULFONIC ACID 10
## 5 CHILDRENS HEALTH 1 PERFLUOROOCTANOIC ACID 10
## 6 CRYPTORCHIDISM 1 HUMAN 7
## 7 DEVELOPMENTAL TOXICITY 1 PREGNANCY 7
## 8 ENDOCRINE DISRUPTION 1 HUMANS 6
## 9 EPIDEMIOLOGY 1 POLLUTANT 6
## 10 FECUNDABILITY 1 ADOLESCENT 5
plot(x = results1, k = 10, pause = F) #produces a sequence of standard summary plots displayed in the plots window
#for individual plots assign the above to an object (p <- ) and then call individual plots (1-5, e.g. p[5])
# Publication journals
knitr::kable(table(bib_sco$JI), caption = "Table SX. counts of numbers of publications per journal")
Table SX. counts of numbers of publications per journal
CRIT. REV. TOXICOL. |
1 |
ENVIRON. INT. |
1 |
ENVIRON. RES. |
1 |
ENVIRON. SCI. TECHNOL. |
1 |
HUM. REPROD. UPDATE |
1 |
INT. J. ENVIRON. RES. PUBLIC HEALTH |
1 |
J. TOXICOL. ENVIRON. HEALTH PART B CRIT. REV. |
1 |
REPROD. TOXICOL. |
1 |
Making keyword co-occurrence matrix
Thematic map based on keywords
#pdf(file="./plots/figure_thematic_map.pdf", width = 5, height = 5, pointsize = 8, family = "Helvetica", fonts = c("Helvetica"))
par(mfrow=c(1,1), mar=c(0,2,0,2))
map_thematic <- thematicMap(bib_sco, field = "ID", n = 1000, minfreq = 5, stemming = FALSE, size = 0.5, n.labels = 1, repel = TRUE)
plot(map_thematic$map)
#dev.off()
Author collaboration network
Links between review authors based on their co-authorship on these reviews. Currently, there are too few articles included for the connections between the clusters (publications) to be visible, although it looks like one set of authors co-authored two of the eight reviews from the pilot set.
NetMatrix_authors <- biblioNetwork(bib_sco, analysis = "collaboration", network = "authors", sep = ";")
NetMatrix_authors_plot <- networkPlot(NetMatrix_authors, n = 50,
Title = "Author collaboration",
type = "auto", size = 2, size.cex = TRUE,
edgesize = 10, labelsize = 1.1) #note there are potentially some mistakes in authors initials
Country collaboration network
Links between countries, based on the institutional affiliations of the review authors.
#pdf(file="./plots/figure_country_collaboration_network.pdf",width = 4, height = 4, pointsize = 12, family = "Helvetica", fonts = c("Helvetica"))
#par(mfrow=c(1,1), mar=c(2,3,2,3))
bib_sco2 <- metaTagExtraction(bib_sco, Field = "AU_CO", sep = ";") #we need to extract countries from the affiliations first
#bib_sco2$AU_CO[1:10]
NetMatrix_country <- biblioNetwork(bib_sco2, analysis = "collaboration", network = "countries", sep = ";")
NetMatrix_country_plot <- networkPlot(NetMatrix_country, n = 50,
Title = "Country collaboration", type = "auto", size=TRUE,
remove.multiple=FALSE, labelsize=1.5)
#dev.off()
Analyzing lists of references (cited works)
# analyzing lists of references
CR <- citations(bib_sco, field = "article", sep = ";") #list of most cited articles
cbind(CR$Cited[1:10]) #ten most cited articles
## [,1]
## AN UPDATED ANALYSIS FROM THE DANISH NATIONAL BIRTH COHORT (2018) INT. J. ENVIRON. RES. PUBL. HEALTH, 15 2
## MENG, Q., INOUE, K., RITZ, B., OLSEN, J., LIEW, Z., PRENATAL EXPOSURE TO PERFLUOROALKYL SUBSTANCES AND BIRTH OUTCOMES 2
## VESTED, A., RAMLAU-HANSEN, C.H., OLSEN, S.F., BONDE, J.P., KRISTENSEN, S.L., HALLDORSSON, T.I., BECHER, G., TOFT, G., ASSOCIATIONS OF IN UTERO EXPOSURE TO PERFLUORINATED ALKYL ACIDS WITH HUMAN SEMEN QUALITY AND REPRODUCTIVE HORMONES IN ADULT MEN (2013) ENVIRON HEALTH PERSPECT, 121, PP. 453-458 2
## (2002) GLOBAL ASSESSMENT OF THE STATE-OF-THE-SCIENCE OF ENDOCRINE DISRUPTERS, PP. 1-133. , DAMSTRA T, BARLOW S, BERGMAN A, KAVLOCK R, VAN DER KRAAK G (EDS). GENEVA: WORLD HEALTH ORGANISATION 1
## (2011) ESTIMATION PROGRAMS INTERFACE SUITE FOR MICROSOFT WINDOWS, , USEPA UNITED STATES ENVIRONMENTAL PROTECTION AGENCY, WASHINGTON, DC 1
## (2012) STATE OF THE SCIENCE OF ENDOCRINE DISRUPTING CHEMICALS, PP. 1-296. , BERGMAN , HEINDEL JJ, JOBLING S, KIDD KA, ZOELLER RT (EDS). GENEVA: WHO-UNEP 1
## (2013) GLOBAL MERCURY ASSESSMENT 2013: SOURCES, EMISSIONS, RELEASES AND ENVIRONMENTAL TRANSPORT, , UNEP. UNITED NATIONS ENVIRONMENT PROGRAMME CHEMICALS BRANCH: GENEVA, SWITZERLAND 1
## (2013) STATE OF THE SCIENCE OF ENDOCRINE DISRUPTING CHEMICALS - 2012, , SWITZERLAND: UNITED NATIONS ENVIRONMENT PROGRAMME (UNEP) AND THE WORLD HEALTH ORGANIZATION (WHO):. GENEVA 1
## (2013) STOCKHOLM CONVENTION ON PERSISTENT ORGANIC POLLUTANTS, , UNEP. UNITED NATIONS ENVIRONMENT PROGRAMME: GENEVA, SWITZERLAND 1
## (2017) SOME CHEMICALS USED AS SOLVENTS AND IN POLYMER MANUFACTURE, , LYON, FRANCE: INTERNATIONAL AGENCY FOR RESEARCH ON CANCER (IARC) WORLD HEALTH ORGANIZATION (WHO), IARC MONOGRAPHS ON THE EVALUATION OF CARCINOGENIC RISKS TO HUMANS VOLUME 110 1
CR <- citations(bib_sco, field = "author", sep = ";")
cbind(CR$Cited[1:10]) #ten most cited authors
## [,1]
## TOFT G 20
## BONDE J P 19
## CALAFAT A M 17
## LINDH C H 17
## LOPEZ ESPINOSA M J 15
## OLSEN J 15
## SKAKKEBAEK N E 15
## FLETCHER T 14
## HAUG L S 14
## OLSEN G W 14
Article coupling network: Citation network
# classical article coupling network: Citation network
NetMatrix_coupling <- biblioNetwork(bib_sco, analysis = "coupling", network = "references", sep = "; ")
NetMatrix_coupling_plot <- networkPlot(NetMatrix_coupling, n = 20,
Title = "Paper co-citation", type = "fruchterman", size=T,
remove.multiple=FALSE, labelsize=0.8,edgesize = 5) #plot the network
Historical co-citation network
Currently not enough papers to create this network plot.
histResults <- histNetwork(bib_sco, sep = ";")
##
## SCOPUS DB: Searching local citations (LCS) by document titles (TI) and DOIs...
##
## Found 2 documents with no empty Local Citations (LCS)
# not possible now - Found 2 documents with no empty Local Citations (LCS)
# pdf(file="./plots/Figure_hstorical_network.pdf", width=8, height=8, pointsize=10)
# par(mfrow=c(1,1), mar=c(0,0,0,0))
#histPlot(histResults, size=TRUE, arrowsize = 0.2)
# dev.off()
Cross-objective insights
We will investigate how review type, indicators of review quality or transparency are related other review properties, such as publication date, main review topic and subject, etc. Below, we provide a few examples where data belonging to different objectives is use together to answer these additional questions.
# Review_type_claimed vs. Human_animal_environment contingency table
#table(mdata$Review_type_claimed) #needs cleaning
mdata$Review_type_claimed <- tolower(mdata$Review_type_claimed)
mdata %>% count(Human_animal_environment, Review_type_claimed) -> t1
# barplot of claimed review type vs. subject
ggplot(t1, aes(x = Review_type_claimed, y = n)) +
labs(title = expression("Review type and subject")) + #~bold(A.)~' Type and subject'
coord_flip() +
scale_fill_brewer() +
geom_col(aes(fill = Human_animal_environment), width = 0.7) +
theme_light() +
theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black"),
legend.title = element_blank(), legend.text=element_text(size = 4),
axis.title.x = element_text(size = 10), axis.title.y = element_blank())
#ggsave(here("plots","figure_barplot_type_subject.pdf"), width = 4, height = 4, units = "cm", scale = 2, device = cairo_pdf)
## Code for other simple plots with two crossed variables (not run for pilot data)
# Barplot example Review_type_claimed vs. Human_animal_environment contingency table
mdata %>% count(Human_animal_environment, Review_type_claimed) -> t1
ggplot(t1, aes(x = Human_animal_environment, y = n)) +
coord_flip() +
theme(legend.position = "top") +
geom_col(aes(fill = Review_type_claimed), width = 0.7)
ggplot(t1, aes(x = Review_type_claimed, y = n)) +
coord_flip() +
theme(legend.position = "top") +
geom_col(aes(fill = Human_animal_environment), width = 0.7)
# Heatmap example (it will look better with more data points)
t2 <- xtabs(~Review_type_claimed+Human_animal_environment, data = mdata) #contingency table
#heatmap(t2, Colv = NA, Rowv = NA, scale="column", col=cm.colors(256), margins = c(15, 25)) #heatmap
heatmap(t2, Colv = NA, Rowv = NA, scale="column", col=heat.colors(12, rev=TRUE, alpha = 0.6), margins = c(15, 25)) #heatmap
#different vesion with side colour bars
#rc <- rainbow(nrow(t2), start = 0, end = .3)
#cc <- rainbow(ncol(t2), start = 0, end = .3)
#heatmap(t2, Colv = NA, Rowv = NA, scale="column", RowSideColors = rc, ColSideColors = cc, margins = c(15, 25)) #heatmap