Supplementary Material - pilot data processing and visualization

PFAS exposure of humans, animals and the environment: protocol of an evidence review map and bibliometric analysis

In this Rmarkdown document we provide preliminary workflow and code for:

  • Objective 1. Mapping: What evidence on PFAS has been synthesized? We aim to reveal what areas in the realm of PFAS health-related and environmental research have been synthesized the most and where syntheses of evidence are lacking.

  • Objective 2. Critical appraisal: How robust are syntheses of PFAS evidence? We will examine the included syntheses for their reporting quality and potential biases, in order to assess reliability of reviews’ conclusions, reveal syntheses that should be re-done, and highlight the aspects of review methodology that need to be improved.

  • Objective 3. Bibliometrics: How is synthesized PFAS evidence connected? We will examine which countries and institutions are mostly involved in secondary PFAS research and what do the networks between these institutions look like.

  • Cross-Objective Insights: We will investigate how review type, indicators of review quality or transparency are related other review properties, such as publication date, main review topic and subject, etc.

Note: The code and all created outputs are based on a pilot set of data extracted from only 8 papers that fulfilled our inclusion criteria during the piloting stage. The code will need to be adjusted and expanded for the final data set (e.g., additional data cleaning steps or adjustments to the plot styling and contents). These changes will be documented via tracking on GitHub.

R code processing messages and warnings are suppressed from the output document.

Setup

Set global code chunk parameters and load packages.

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.4     ✓ purrr   0.3.4
## ✓ tibble  3.1.2     ✓ dplyr   1.0.5
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(here)
## here() starts at /Users/itchyshin/Dropbox/Local_git_repos/UmbrellaReview_PFAS_pilot
library(stringr)
library(knitr)
library(formatR)
library(forcats)
library(ggplot2)
library(hrbrthemes) #for ggplot2
library(patchwork) #for ggplot2
library(bibliometrix)
## To cite bibliometrix in publications, please use:
## 
## Aria, M. & Cuccurullo, C. (2017) bibliometrix: An R-tool for comprehensive science mapping analysis, 
##                                  Journal of Informetrics, 11(4), pp 959-975, Elsevier.
##                         
## 
## https://www.bibliometrix.org
## 
##                         
## For information and bug reports:
##                         - Send an email to info@bibliometrix.org   
##                         - Write a post on https://github.com/massimoaria/bibliometrix/issues
##                         
## Help us to keep Bibliometrix free to download and use by contributing with a small donation to support our research team (https://bibliometrix.org/donate.html)
## 
##                         
## To start with the shiny web-interface, please digit:
## biblioshiny()
library(igraph)
## 
## Attaching package: 'igraph'
## The following objects are masked from 'package:dplyr':
## 
##     as_data_frame, groups, union
## The following objects are masked from 'package:purrr':
## 
##     compose, simplify
## The following object is masked from 'package:tidyr':
## 
##     crossing
## The following object is masked from 'package:tibble':
## 
##     as_data_frame
## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum
## The following object is masked from 'package:base':
## 
##     union
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)

Load pilot data

Manually extracted pilot data is stored in four separate .csv files representing different aspects of the data (extracted via structured predefined Google Forms - one per table). Additional fixed table stores pre-extracted data on PFAS types included in this map.

Bibliographic data records are exported from Scopus (including cited references field) in .bib format and locally saved as scopus.bib. the pilot version of this file only includes bibliographic information of ‘systeamtic review family’-like reviews included in Objectives 1 and 2. In the actual work bibliographic information will be expanded to all reviews on PFAS exposure in humans, animals and the environment.

The study identity (authors, title) has been obscured in the pilot data extraction file. It will be public in the final data extraction files.

# Data from .csv file - main data sheet
mdata <- read_csv(here("data","ReviewMap_PFAS_main_pilot.csv"), skip = 0)
#dim(mdata) #8 rows 34 columns

# Data from .csv file - Species_info sheet
spdata <- read_csv(here("data","ReviewMap_PFAS_species_pilot.csv"), skip = 0) 
#dim(spdata) #2 rows 4 columns
# change to long format (one species per row - one or multple rows per study):
spdata %>% select(-X4) %>% separate_rows(Species_scientific_name, sep='; ') -> spdata
#dim(spdata) #29 rows 3 columns

# Data from .csv file - PFAS_types sheet:
ptdata <- read_csv(here("data","ReviewMap_PFAS_types_pilot.csv"), skip = 0) 
#dim(ptdata) #8 rows 4 columns
# change to long format (one type per row - one or multple rows per study):
ptdata %>% select(-X4) %>% separate_rows(PFAS_type, sep=', ') -> ptdata
#dim(ptdata) #70 rows 3 columns

# Data from .csv file - PFAS_info sheet:
pidata <- read_csv(here("data","PFAS_info.csv"), skip = 0) 
#dim(pidata) #35 rows 7 columns

# Critical apprisal(AMSTAR2) from .csv file:
qdata <- read_csv(here("data", "ReviewMap_PFAS_quality_pilot.csv"), skip = 0)
#dim(qdata) #11 rows 38 columns
qdata %>% select(-X34) -> qdata #remove empty column

# Bibliometric data of included papers (with references) dwonloaded from Scopus in .bib format:
bib_sco <- convert2df(here("data","scopus.bib"), dbsource = "scopus", format = "bibtex")
## 
## Converting your scopus collection into a bibliographic dataframe
## 
## Done!
## 
## 
## Generating affiliation field tag AU_UN from C1:  Done!
#dim(bib_sco) #8 rows 33 columns

Merge data stored across tables

The data stored across multiple flat spreadsheets needs merged for some of the analyses.

## Prepare - remove columns starting with "Checked" 
# (info on data checking - these fields will be present in the final extraction sheet), 
# also remove any other redundand columns:
# mdata %>% select(-starts_with("Checked")) -> mdata
# spdata %>% select(-starts_with("Checked")) -> spdata
# ptdata %>% select(-starts_with("Checked")) -> ptdata
# pidata %>% select(-starts_with("Checked")) -> pidata

## Merge main data with species info
mspdata <- left_join(mdata, spdata, by = "Study_ID")
#dim(mspdata) #8 rows 37 columns
#names(mspdata)
#str(mspdata)
#View(mspdata)

# Merge PFAS types data with PFAS info and then with main data
ptidata <- left_join(ptdata, pidata, by = "PFAS_type")
#dim(ptidata) #8 rows 10 columns
#names(ptidata)
#str(ptidata)
#View(ptidata)

mpidata <- left_join(mdata, ptidata, by = "Study_ID")
#dim(mpidata) #8 rows 43 columns
#names(mpidata)
#str(mpidata)
#View(mpidata)

Make a simple summary table of included papers

Note: authors names and paper titles are blinded for the protocol.

knitr::kable(select(mdata, Study_ID,    Author_year,    Paper_title), caption = "Table of included studies")
Table of included studies
Study_ID Author_year Paper_title
S_001 Authors1_2017 Title1
S_002 Authors2_2013 Title2
S_003 Authors3_2019 Title3
S_004 Authors4_2021 Title4
S_005 Authors5_2017 Title5
S_006 Authors6_2017 Title6
S_007 Authors7_2016 Title7
S_008 Authors8_2020 Title8

Objective 1. Mapping: What evidence on PFAS has been synthesized?

Here we aim to reveal what areas in the realm of PFAS health-related and environmental research have been synthesized the most and where syntheses of evidence are lacking. Below, we include examples of potential data summaries and visualizations that will form the evidence review map. The pilot dataset presented here consists of 8 review studies.

Focus on PFAS, or POPs, and how many PFAS

#str(mdata)
mdata %>% count(PFAS_focus, PFAS_one_many) %>% arrange(desc(n)) %>% 
  filter(PFAS_focus != "NA") -> t_PFASfocus #filter out NA

ggplot(t_PFASfocus, aes(x = PFAS_focus, y = n)) +
 theme_light() +
 labs(title = expression("PFAS focus")) + #~bold(A.)~' Type and subject'
 coord_flip() +
 scale_fill_brewer() +
 guides(fill=guide_legend(title="PFAS:")) + 
 geom_col(aes(fill = PFAS_one_many), width = 0.7) + 
 scale_y_continuous(name = "Article count") +
 theme_light() +
 theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black"), 
       legend.title = element_text(size=10),  legend.text=element_text(size = 10), 
       axis.title.x = element_text(size = 10), axis.title.y = element_blank())

#ggsave(here("plots","figure_PFAS_focus.pdf"), width = 4, height = 6, units = "cm", scale = 2, device = cairo_pdf)

PFAS types

#str(ptidata)
ptidata %>% count(PFAS_type) %>% arrange(desc(n)) %>% filter(PFAS_type != "NA") -> t_PFAStype #filter out NA
t_PFAStype$PFAS_type <- factor(t_PFAStype$PFAS_type, levels = t_PFAStype$PFAS_type[order(t_PFAStype$n, decreasing = FALSE)])
PFAS_levels_order <- t_PFAStype$PFAS_type[order(t_PFAStype$n, decreasing = FALSE)] #save for another plot

# simple barplot with PFAS types
ggplot(t_PFAStype, aes(x = PFAS_type, y = n)) +
  geom_col(aes(fill = ""), width = 0.7) +
  theme_light() +
  labs(title = expression("PFAS types reviewed")) + #~bold(A.)~' Type and subject'
  coord_flip()+
  scale_y_continuous(name = "Article count") +
  scale_fill_manual(values = c("#919191")) +
  theme(legend.position = "none", axis.title.x = element_text(size = 10), axis.title.y = element_blank())

#ggsave(here("plots","figure_PFAS_types.pdf"), width = 4, height = 6, units = "cm", scale = 2, device = cairo_pdf)

Main review topics (MeSH terms)

Each records usually has multiple associated MeSH terms, which are separated with a semicolon (or a semicolon with a newline symbol).
The main topics of the article are denoted by the MeSH terms with asterisks *, but some records will not have main topics, so its safer to ignore this information.
Some MeSH headings also have qualifiers (sometimes called subheadings) - these are shown after a forward-slash and are used to “describe the specific aspects of the MeSH heading that are pertinent to the article” (https://www.nlm.nih.gov/bsd/indexing/training/SUB_010.html). MeSH terms and qualifiers could be analysed separately.

# change to long format (one species per row - one or multple rows per study):
mdata %>% select(MeSH_terms) %>% separate_rows(MeSH_terms, sep = ";\\\n") %>% separate_rows(MeSH_terms, sep = "; ") -> t_MESH
#bring to lower case and remove ending and trailing whitespace from character strings, single character vector:
tmesh <- str_to_lower(trimws(t_MESH$MeSH_terms)) 
#remove star character:
tmesh <- sub("[\\*]", "", tmesh) 

#separate terms before and after "/" into separate columns (headings and quantifiers, respectively)
tmesh2 <- str_split_fixed(tmesh, " / ", n = 2)

#create a MeSH dataframe with three columns: terms, headings, qualifiers
t_MESH2 <- bind_cols(tmesh, tmesh2[,1], tmesh2[,2])
colnames(t_MESH2)  <- c("MeSH_terms", "MeSH_headings", "MeSH_qualifiers")
#count frequencies of headings
t_MESH2 %>% select(MeSH_headings) %>% count(MeSH_headings) %>% arrange(desc(n)) %>% 
  filter(MeSH_headings != "NA") ->  MeSH_headings_counts 
#str(MeSH_headings_counts)

MeSH_headings_counts$MeSH_headings <- factor(MeSH_headings_counts$MeSH_headings, levels = MeSH_headings_counts$MeSH_headings[order(MeSH_headings_counts$n, decreasing = FALSE)])
MESH_headings_order <- MeSH_headings_counts$MeSH_headings[order(MeSH_headings_counts$n, decreasing = FALSE)] #save for the next plot

# simple barplot with top 10 MESH headings
ggplot(MeSH_headings_counts[1:10, ], aes(x = MeSH_headings, y = n)) +
  geom_col(aes(fill = ""), width = 0.7) +
  theme_light() +
  labs(title = expression("Top 10 MeSH headings reviewed")) + #~bold(A.)~' Type and subject'
  coord_flip() +
  scale_y_continuous(name = "Article count") +
  scale_fill_manual(values = c("#919191")) +
  theme(legend.position = "none", axis.title.x = element_text(size = 10), axis.title.y = element_blank())

#ggsave(here("plots","figure_MeSH_headings.pdf"), width = 4, height = 6, units = "cm", scale = 2, device = cairo_pdf)
#count frequencies of qualifiers
t_MESH2 %>% select(MeSH_qualifiers) %>% count(MeSH_qualifiers) %>% arrange(desc(n)) %>% filter(MeSH_qualifiers != "NA") %>% filter(MeSH_qualifiers != "")->  MeSH_qualifiers_counts 
#str(MeSH_qualifiers_counts)

MeSH_qualifiers_counts$MeSH_qualifiers <- factor(MeSH_qualifiers_counts$MeSH_qualifiers, levels = MeSH_qualifiers_counts$MeSH_qualifiers[order(MeSH_qualifiers_counts$n, decreasing = FALSE)])
MESH_qualifiers_order <- MeSH_qualifiers_counts$MeSH_qualifiers[order(MeSH_qualifiers_counts$n, decreasing = FALSE)] #save for the next plot

# simple barplot with MESH qualifiers
ggplot(MeSH_qualifiers_counts[, ], aes(x = MeSH_qualifiers, y = n)) +
  geom_col(aes(fill = ""), width = 0.7) +
  theme_light() +
  labs(title = expression("MeSH qualifiers reviewed")) + #~bold(A.)~' Type and subject'
  coord_flip() +
  scale_y_continuous(name = "Article count") +
  scale_fill_manual(values = c("#919191")) +
  theme(legend.position = "none", axis.title.x = element_text(size = 10), axis.title.y = element_blank())

#ggsave(here("plots","figure_MeSH_qualifiers.pdf"), width = 4, height = 6, units = "cm", scale = 2, device = cairo_pdf)

Main review subjects

#str(mdata)
mdata %>% count(Human_animal_environment) %>% arrange(desc(n)) %>% 
  filter(Human_animal_environment != "NA") %>% filter(Human_animal_environment != "") -> t_subject #filter out NA
t_subject$Human_animal_environment <- factor(t_subject$Human_animal_environment, levels = unique(t_subject$Human_animal_environment[order(t_subject$n, decreasing = FALSE)]))
subjects_levels_order <- t_subject$Human_animal_environment[order(t_subject$n, decreasing = FALSE)] #save for another plot

# simple barplot with MESH_heading
ggplot(t_subject, aes(x = Human_animal_environment, y = n)) +
  geom_col(aes(fill = ""), width = 0.7) +
  theme_light() +
  labs(title = expression("Main subjects reviewed")) + #~bold(A.)~' Type and subject'
  coord_flip()+
  scale_y_continuous(name = "Article count") +
  scale_fill_manual(values = c("#919191")) +
  theme(legend.position = "none", axis.title.x = element_text(size = 10), axis.title.y = element_blank())

#ggsave(here("plots","figure_subjects.pdf"), width = 4, height = 2, units = "cm", scale = 2, device = cairo_pdf)

Combining MeSH qualifiers with main review subjects

#str(mdata)
#subset main dat by main subject
#table(mdata$Human_animal_environment)

# change to long format (one species per row - one or multple rows per study):
mdata %>% group_by(Human_animal_environment) %>% select(MeSH_terms) %>% 
  separate_rows(MeSH_terms, sep = ";\\\n") %>% separate_rows(MeSH_terms, sep = "; ") -> t_MESHby
#str(t_MESHby)

tmeshby <- str_to_lower(trimws(t_MESHby$MeSH_terms)) #bring to lower case and remove ending and trailing whitespace from character strings, single character vector
tmeshby <- sub("[\\*]", "", tmeshby) #remove star character

#separate terms before and after "/" into separate columns (headings and quantifiers, respectively)
tmeshby2 <- str_split_fixed(tmeshby, " / ", n = 2)

#create a MeSH dataframe with three columns: terms, headings, qualifiers
t_MESHby2 <- bind_cols(t_MESHby$Human_animal_environment, tmeshby, tmeshby2[,1], tmeshby2[,2])
colnames(t_MESHby2)  <- c("Subject", "MeSH_terms", "MeSH_headings", "MeSH_qualifiers")
#str(t_MESHby2)

t_MESHby2 %>% count(MeSH_qualifiers, Subject) %>% arrange(desc(n)) %>% filter(MeSH_qualifiers != "NA") %>% 
  filter(MeSH_qualifiers != "") -> t_topic_subject #filter out NA

t_topic_subject$Subject <- factor(t_topic_subject$Subject, levels = subjects_levels_order) #use order from the previous graph
t_topic_subject$MeSH_qualifiers <- factor(t_topic_subject$MeSH_qualifiers, levels = MESH_qualifiers_order) #use order from the previous graph
#str(t_topic_subject)

ggplot(t_topic_subject, aes(x = MeSH_qualifiers, y = n)) +
 theme_light() +
 labs(title = expression("MeSH qualifiers and subjects reviewed")) + #~bold(A.)~' Type and subject'
 coord_flip()  +
 scale_fill_brewer() +
 geom_col(aes(fill = Subject), width = 0.7) + 
 scale_y_continuous(name = "Article count") +
 theme_light() +
 theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black"), legend.title = element_blank(),  legend.text=element_text(size = 10), axis.title.x = element_text(size = 10), axis.title.y = element_blank()) 

#ggsave(here("plots","figure_MeSH_qualifiers_subjects.pdf"), width = 4, height = 6, units = "cm", scale = 2, device = cairo_pdf)

PFAS focus by main review subject

# PFAS_one_many vs. Human_animal_environment contingency table
mdata %>% count(Human_animal_environment, PFAS_one_many) -> t1

ggplot(t1, aes(x = PFAS_one_many , y = n)) +
 coord_flip()  +
 geom_col(aes(fill = Human_animal_environment), width = 0.7) +
 theme_light() +  
 theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black")) +
 ylab("Article count") +
 xlab("PFAS") +
 guides(fill=guide_legend(title="Subject:"))

#ggsave(here("plots","figure_PFAS_focus_subjects.pdf"), width = 4, height = 2, units = "cm", scale = 2, device = cairo_pdf)

PFAS types by main review subject

#str(mpidata)
mpidata %>% count(PFAS_type, Human_animal_environment) %>% arrange(desc(n)) %>% filter(PFAS_type != "NA") -> t_PFAStype #filter out NA
#t_PFAStype$PFAS_type <- factor(t_PFAStype$PFAS_type, levels = unique(t_PFAStype$PFAS_type[order(t_PFAStype$n, decreasing = FALSE)]))
t_PFAStype$PFAS_type <- factor(t_PFAStype$PFAS_type, levels = PFAS_levels_order) #use order from the previous graph


p_PFAStype <- ggplot(t_PFAStype, aes(x = PFAS_type, y = n)) +
 theme_light() +
 labs(title = expression("PFAS type and main subjects reviewed")) + #~bold(A.)~' Type and subject'
 coord_flip()  +
 scale_fill_brewer() +
 geom_col(aes(fill = Human_animal_environment), width = 0.7) + 
 scale_y_continuous(name = "Article count") +
 theme_light() +
 theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black"), legend.title = element_blank(),  legend.text=element_text(size = 10), axis.title.x = element_text(size = 10), axis.title.y = element_blank()) +
  theme(legend.position = c(0.8, 0.5), legend.background = element_rect(fill = "white", color = "black")) #lace legend inside plot area
#theme(legend.position = "none", axis.title.x = element_text(size = 10), axis.title.y = element_blank())
p_PFAStype

#ggsave(here("plots","figure_PFAStype_subject.pdf"), width = 6, height = 8, units = "cm", scale = 2, device = cairo_pdf)

How many species and which ones

#str(mspdata)
mspdata %>% count(Species_scientific_name, Species_one_many) %>% arrange(desc(n)) %>% filter(Species_one_many != "NA") -> t_spfocus #filter out NA

ggplot(t_spfocus, aes(x = Species_scientific_name, y = n)) +
 theme_light() +
 labs(title = expression("Species focus")) + #~bold(A.)~' Type and subject'
 coord_flip()  +
 scale_fill_brewer() +
 guides(fill=guide_legend(title="Species:")) + 
 geom_col(aes(fill = Species_one_many), width = 0.7) + 
 scale_y_continuous(name = "Article count") +
 theme_light() +
 theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black"), legend.title = element_text(size=10),  legend.text=element_text(size = 10), axis.title.x = element_text(size = 10), axis.title.y = element_blank()) #+

#ggsave(here("plots","figure_species_focus.pdf"), width = 4, height = 6, units = "cm", scale = 2, device = cairo_pdf)
mdata %>% count(Species_one_many, PFAS_one_many) -> t2 #make table of counts across two columns 
t2$Species_one_many <- as.factor(t2$Species_one_many)
levels(t2$Species_one_many) <- c("Many", "One") #reorder levels for Species_one_many
t2 %>% filter(Species_one_many != "NA") -> t2 #filter out NA

# simple dot plot
ggplot(data = t2, aes(Species_one_many, PFAS_one_many)) + geom_point(aes(size = n), colour = "darkgrey") +
theme_classic() +
  theme(axis.ticks.x=element_blank(), axis.text.x=element_text(size = 20, angle = 90, hjust = 0)) +
  theme(axis.ticks.y=element_blank(), axis.text.y=element_text(size = 20)) +
  theme(axis.line=element_blank()) +
  theme(text = element_text(size = 22)) +
  theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black")) +
  theme(plot.margin = unit(c(10,10,10,10), "mm")) +
  scale_size_continuous(range = c(1, 10)) +
  #theme(axis.title = element_blank()) +
  scale_x_discrete(position = "top") +
  xlab("Species") +
  ylab("PFAS")

 #+ labs(title = "Counts of reviews by subject and type")

#ggsave(here("plots","figure_dotplot_PFAS_Species.pdf"), width = 4, height = 4, units = "cm", scale = 4, device = cairo_pdf)

Objective 2. Critical appraisal: How robust are syntheses of PFAS evidence?

Here we will examine the included syntheses for their reporting quality and potential biases, in order to assess reliability of reviews’ conclusions, reveal syntheses that should be re-done, and highlight the aspects of review methodology that need to be improved.

Make a table with AMSTAR2 questions

The questions were stored separately in the first row of the file, and were loaded separately. We display them in a table, alongside the short labels that can be used in the plots. Full description of the questions and coding rules (from an adapted AMSTAR2 checklist; Shea et al. 2017; Samarasinghe et al. 2019), are provided in the protocol.

questions_list <- qdata %>% select(starts_with("Q") & !ends_with("_comment")) %>% colnames()  #only select columns with assessment codes (data)

questions <- tibble(questions = questions_list, label = c("question and criteria", "a priori protocol", "included study designs", "comprehensive search", "selection duplicated", "extraction duplicated", "list of excluded studies", "summary of included studies", "critical appraisal", "sources of funding", "quantitative synthesis", "risk of bias", "effect of bias", "variability investigated", "publication bias", "conflict of interest"))
#str(questions)

knitr::kable(questions, caption = "Table. List of AMSTAR2 questions with  labels for plotting")
Table. List of AMSTAR2 questions with labels for plotting
questions label
Q1. Are the research questions and inclusion criteria for the review clearly delineated? question and criteria
Q2. Did the report of the review contain an explicit statement that the review methods were established prior to the conduct of the review and did the report justify any significant deviations from the protocol? a priori protocol
Q3. Did the review authors explain their selection of the study designs for inclusion in the review? included study designs
Q4. Did the review authors use a comprehensive literature search strategy? comprehensive search
Q5. Did the review authors perform study selection in duplicate? selection duplicated
Q6. Did the review authors perform data extraction in duplicate? extraction duplicated
Q7. Did the review authors provide a list of excluded studies and justify the exclusions? list of excluded studies
Q8. Did the review authors describe the included studies in adequate detail? summary of included studies
Q9. Did the review authors use a satisfactory technique for assessing the risk of bias (RoB) in individual studies that were included in the review? critical appraisal
Q10. Did the review authors report on the sources of funding for the studies included in the review? sources of funding
Q11. If meta-analysis was performed did the review authors use appropriate methods for statistical combination of results? quantitative synthesis
Q12. If meta-analysis was performed, did the review authors assess the potential impact of RoB in individual studies on the results of the meta-analysis or other evidence synthesis? risk of bias
Q13. Did the review authors account for RoB in individual studies when interpreting/ discussing the results of the review? effect of bias
Q14. Did the review authors provide a satisfactory explanation for, and discussion of, any heterogeneity observed in the results of the review? variability investigated
Q15. If they performed quantitative synthesis did the review authors carry out an adequate investigation of publication bias (small study bias) and discuss its likely impact on the results of the review? publication bias
Q16. Did the review authors report any potential sources of conflict of interest, including any funding they received for conducting the review? conflict of interest

Summary plot for pilot AMSTAR2 data

AMSTAR2 assessment results as percentages of each answer score per question.

## prepare data 
dim(qdata) #some empty rows to be removed
## [1] 11 33
#studies <- qdata$Author_year #normally would use this

#only select columns with assessment codes (drop comments and empty rows)
qtable <- qdata %>% filter(Study_ID!="NA") %>% select(starts_with("Q") & !ends_with("_comment"))  

#simiplify column names to Q+number format
names(qtable) <- gsub("\\..*", "", names(qtable)) 

#simplify all the answers to short strings (version for a single column: gsub(" =.*", "", qtable$Q1)):
qtable <- apply(qtable, 2, function(y) as.character(gsub(" =.*", "", y)))

#save studies in a new vector
studies <- qdata$Study_ID[!is.na(qdata$Study_ID)]

#convert to long format data frame with Study_ID
qtable_long <- data.frame(study = rep(as.factor(studies), each = length(studies)),
                     question = rep(colnames(qtable), each = length(studies)),
                     score = as.vector(qtable), stringsAsFactors = TRUE) #make long format table

rownames(qtable_long) <- NULL
qtable_long$question <- factor(qtable_long$question, levels(qtable_long$question)[rev(c(1,9:16,2:8))]) #setting the order of levels - by Q-number

#add a column with verbal expression of scores:   
qtable_long$score_word <- qtable_long$score
levels(qtable_long$score_word) <- c("No", "Partially", "Yes", "NotApplicable")


summaryplot <- ggplot(data = qtable_long) +
      geom_bar(mapping = aes(x = question, fill = score_word), width = 0.7,
               position = "fill", color = "black") +
      coord_flip(ylim = c(0, 1)) +
      guides(fill = guide_legend(reverse = TRUE)) +
      scale_fill_manual("Risk of Bias",
                        labels = c("  Partially   ", 
                                   "  No  ",
                                   "  Yes  ",
                                   "  Not Applicable  "),
                        values = c(Partially = "#E2DF07", #yellow
                                   No = "#BF0000", #red
                                   Yes = "#02C100", #green
                                   NotApplicable = "grey")) +
      scale_y_continuous(labels = scales::percent) +
      theme(axis.title.x = element_blank(),
            axis.title.y = element_blank(),
            axis.ticks.y = element_blank(),
            axis.text.x = element_text(size = 14, color = "black", hjust=0.5),
            axis.text.y = element_text(size = 14, color = "black", hjust=0),
            axis.line.x = element_line(colour = "black", size = 0.5, linetype = "solid"),
            legend.position = "top",
            panel.grid.major = element_blank(),
            panel.grid.minor = element_blank(),
            panel.background = element_blank(),
            legend.background = element_rect(linetype = "solid", colour = "grey"),
            legend.title = element_blank(),
            legend.key.size = unit(0.75, "cm"),
            legend.text = element_text(size = 14))

## display plot
summaryplot

## save plot
#ggsave(here("plots","figure_AMSTAR2_summary_v01.pdf"), width = 10, height = 8, units = "cm", scale = 2, device = cairo_pdf)

Individual scores plot for pilot AMSTAR2 data

AMSTAR2 assessment results as a score per article per question.

## data prep
dim(qdata) #some empty rows to be removed
## [1] 11 33
#save studies in a new vector
studies <- qdata$Study_ID[!is.na(qdata$Study_ID)]
#studies <- qdata$Author_year #normally would use this

#only select columns with assessment codes (drop comments and empty rows)
qtable <- qdata %>% filter(Study_ID!="NA") %>% select(starts_with("Q") & !ends_with("_comment"))  

#simiplify column names to Q+number format
names(qtable) <- gsub("\\..*", "", names(qtable)) 

#simplify all the answers to short strings (version for a single column: gsub(" =.*", "", qtable$Q1)):
qtable <- apply(qtable, 2, function(y) as.character(gsub(" =.*", "", y)))

## prepare data - change scores to verbal expressions: 
qtable <- gsub("N/A", "NotApplicable", qtable)
qtable <- gsub("0.5", "Partially", qtable)
qtable <- gsub("1", "Yes", qtable)
qtable <- gsub("0", "No", qtable)


#convert to long format data frame with Study_ID
qtable_long <- data.frame(study = as.factor(studies),
                     question = rep(colnames(qtable), each = length(studies)),
                     measurement = as.vector(qtable), stringsAsFactors=FALSE)
rownames(qtable_long) = NULL
qtable_long$question <- as.factor(qtable_long$question)
qtable_long$question <- factor(qtable_long$question, levels(qtable_long$question)[c(1,9:16,2:8)]) #setting the order of levels - by Q-number
qtable_long$study <- factor(qtable_long$study, levels = unique(studies)[rev(order(unique(qtable_long$study)))]) #Re-order by study name (alphabetically)

##Make scoreplot      
scoresplot <- ggplot(data = qtable_long, aes(y = study, x = question)) +
  geom_tile(color="black", fill="white", size = 0.8) +
  geom_point(aes(color=as.factor(measurement)), size=10) +
  #geom_text(aes(label = measurement), size = 8) +
  scale_x_discrete(position = "top") +
  scale_color_manual(values = c("Partially" = "#E2DF07",
                                "No" = "#BF0000",
                                "Yes" = "#02C100",
                                "NotApplicable" = "grey")) +
  theme_minimal() +
  coord_equal() +
        theme(axis.title.x = element_text(size = 14, color = "black", face = "italic"),
              axis.title.y = element_blank(),
              axis.ticks.y = element_blank(),
              axis.text.y = element_text(size = 15, color = "black"),
              axis.text.x = element_text(size = 14, color = "black", angle = 45, hjust=0),
              legend.position = "bottom",
              legend.background = element_rect(linetype = "solid", colour = "grey"),
              legend.title = element_blank(),
              legend.key.size = unit(0.75, "cm"),
              legend.text = element_text(size = 12),
              panel.grid.major = element_blank(),
              panel.grid.minor = element_blank(),
              panel.background = element_blank(),
              plot.margin = unit(c(1,1,1,0), "cm") #set margins for saving into pdf file
      )
  
scoresplot

## save plot
#ggsave(here("plots","figure_AMSTAR2_scores_v01.pdf"), width = 16, height = 12, units = "cm", scale = 2, device = cairo_pdf)

COI statement

Whether Conflict Of Interests statement is provided for individual reviews.

mdata %>% count(COI_statement) %>% arrange(desc(n)) %>% filter(COI_statement != "NA") -> t_COI_statement #filter out NA

ggplot(t_COI_statement, aes(x = COI_statement, y = n)) +
 theme_light() +
 theme(legend.position = "none", axis.title.y = element_blank()) +
 labs(title = expression("COI statement availability")) + #~bold(A.)~' Type and subject'
 coord_flip()  +
 geom_col(aes(fill = COI_statement), width = 0.7) + 
 scale_y_continuous(name = "Article count")

COI present

Whether potential Conflict Of Interests is acknowledged for individual reviews.

mdata %>% count(COI_present) %>% arrange(desc(n)) %>% filter(COI_present != "NA") -> t_COI_present #filter out NA

ggplot(t_COI_present, aes(x = COI_present, y = n)) +
 theme_light() +
 theme(legend.position = "none", axis.title.y = element_blank()) +
 labs(title = expression("Conflict of Interest")) + #~bold(A.)~' Type and subject'
 coord_flip()  +
 geom_col(aes(fill = COI_present), width = 0.7) + 
 scale_y_continuous(name = "Article count")

Funding statement

Whether funding statement is provided for individual reviews.

mdata %>% count(Funding_statement) %>% arrange(desc(n)) %>% filter(Funding_statement != "NA") -> t_Funding_statement #filter out NA

ggplot(t_Funding_statement, aes(x = Funding_statement, y = n)) +
 theme_light() +
 theme(legend.position = "none", axis.title.y = element_blank()) +
 labs(title = expression("Funding statement availability")) + #~bold(A.)~' Type and subject'
 coord_flip()  +
 geom_col(aes(fill = Funding_statement), width = 0.7) + 
 scale_y_continuous(name = "Article count")

Non-profit funding acknowledged

Whether funding from non-profit (NGO, government, universities, etc.) is acknowledged for individual reviews.

mdata <- mdata %>% rename(Nonprofit_funding = "Non-profit_funding") 

mdata %>% count(Nonprofit_funding) %>% arrange(desc(n)) %>% filter(Nonprofit_funding != "NA") -> t_Nonprofit_funding #filter out NA

ggplot(t_Nonprofit_funding, aes(x = Nonprofit_funding, y = n)) +
 theme_light() +
 theme(legend.position = "none", axis.title.y = element_blank()) +
 labs(title = expression("Non-profit funding")) + #~bold(A.)~' Type and subject'
 coord_flip()  +
 geom_col(aes(fill = Nonprofit_funding), width = 0.7) + 
 scale_y_continuous(name = "Article count")

Industry funding acknowledged

Whether funding from industry (including government-industry collaborations) is acknowledged for individual reviews.

mdata %>% count(Industry_funding) %>% arrange(desc(n)) %>% filter(Industry_funding != "NA") -> t_Funding_industry #filter out NA

ggplot(t_Funding_industry, aes(x = Industry_funding, y = n)) +
 theme_light() +
 theme(legend.position = "none", axis.title.y = element_blank()) +
 labs(title = expression("Funding from industry")) + #~bold(A.)~' Type and subject'
 coord_flip()  +
 geom_col(aes(fill = Industry_funding), width = 0.7) + 
 scale_y_continuous(name = "Article count")

Data availability

Whether extracted data is made available for individual reviews.

#str(mdata)
mdata %>% count(Raw_data) %>% arrange(desc(n)) %>% filter(Raw_data != "NA") -> t_Raw_data #filter out NA

ggplot(t_Raw_data, aes(x = Raw_data, y = n)) +
 theme_light() +
 theme(legend.position = "none", axis.title.y = element_blank()) +
 labs(title = expression("Data availability")) + #~bold(A.)~' Type and subject'
 coord_flip()  +
 geom_col(aes(fill = Raw_data), width = 0.7) + 
 scale_y_continuous(name = "Article count")

#ggsave(here("plots","figure_data_availability.pdf"), width = 4, height = 3, units = "cm", scale = 2, device = cairo_pdf)

Code availability

Whether code for analyses is made available for individual reviews.

mdata %>% count(Analysis_code) %>% arrange(desc(n)) %>% filter(Analysis_code != "NA") -> t_Analysis_code #filter out NA

ggplot(t_Analysis_code, aes(x = Analysis_code, y = n)) +
 theme_light() +
 theme(legend.position = "none", axis.title.y = element_blank()) +
 labs(title = expression("Analysis code availability")) + #~bold(A.)~' Type and subject'
 coord_flip()  +
 geom_col(aes(fill = Analysis_code), width = 0.7) + 
 scale_y_continuous(name = "Article count")


Objective 3. Bibliometrics: How is synthesized PFAS evidence connected?

Examine which countries and institutions are mostly involved in secondary PFAS research. Visualizing networks between these institutions.

Basic summaries from bibliographic records

results1 <- biblioAnalysis(bib_sco) #run basic standard descriptive analysis of the dataset (data frame)
summary(results1, k = 10, pause = F, width = 130) #produces a sequence of standard summary tables displayed in the console
## 
## 
## MAIN INFORMATION ABOUT DATA
## 
##  Timespan                              2013 : 2021 
##  Sources (Journals, Books, etc)        8 
##  Documents                             8 
##  Average years from publication        3.62 
##  Average citations per documents       55.88 
##  Average citations per year per doc    10.25 
##  References                            570 
##  
## DOCUMENT TYPES                     
##  article      3 
##  review       5 
##  
## DOCUMENT CONTENTS
##  Keywords Plus (ID)                    309 
##  Author's Keywords (DE)                42 
##  
## AUTHORS
##  Authors                               56 
##  Author Appearances                    60 
##  Authors of single-authored documents  0 
##  Authors of multi-authored documents   56 
##  
## AUTHORS COLLABORATION
##  Single-authored documents             0 
##  Documents per Author                  0.143 
##  Authors per Document                  7 
##  Co-Authors per Documents              7.5 
##  Collaboration Index                   7 
##  
## 
## Annual Scientific Production
## 
##  Year    Articles
##     2013        1
##     2016        2
##     2017        2
##     2019        1
##     2020        1
##     2021        1
## 
## Annual Percentage Growth Rate 0 
## 
## 
## Most Productive Authors
## 
##    Authors        Articles Authors        Articles Fractionalized
## 1   BONDE JPE            2   COFFMAN E                      0.333
## 2   FLACHS EM            2   HINES EP                       0.333
## 3   HRVIG KK             2   RAPPAZZO KM                    0.333
## 4   TOFT G               2   BONDE JPE                      0.292
## 5   ANDERSEN T           1   ANDERSEN T                     0.250
## 6   BACH CC              1   BORG K                         0.250
## 7   BALLESTER F          1   HITCHCOCK DJ                   0.250
## 8   BALLESTEROS V        1   VARPE                          0.250
## 9   BENFENATI E          1   TOFT G                         0.238
## 10  BONDE JP             1   FLACHS EM                      0.196
## 
## 
## Top manuscripts per citations
## 
##                                                 Paper                                     DOI  TC TCperYear   NTC
## 1  BONDE JP, 2016, HUM REPROD UPDATE                           10.1093/HUMUPD/DMW036          131     21.83 1.394
## 2  RAPPAZZO KM, 2017, INT J ENVIRON RES PUBLIC HEALTH          10.3390/ijerph14070691         120     24.00 1.081
## 3  BALLESTEROS V, 2017, ENVIRON INT                            10.1016/j.envint.2016.10.015   102     20.40 0.919
## 4  BACH CC, 2016, CRIT REV TOXICOL                             10.1080/10408444.2016.1182117   57      9.50 0.606
## 5  DUCHARME NA, 2013, REPROD TOXICOL                           10.1016/j.reprotox.2013.06.070  28      3.11 1.000
## 6  HITCHCOCK DJ, 2019, ENVIRON SCI TECHNOL                     10.1021/acs.est.9b01296          8      2.67 1.000
## 7  PETERSEN KU, 2020, J TOXICOL ENVIRON HEALTH PART B CRIT REV 10.1080/10937404.2020.1798315    1      0.50 1.000
## 8  GAO X, 2021, ENVIRON RES                                    10.1016/j.envres.2021.111632     0      0.00   NaN
## 9  NA                                                          NA                              NA        NA    NA
## 10 NA                                                          NA                              NA        NA    NA
## 
## 
## Corresponding Author's Countries
## 
##   Country Articles  Freq SCP MCP MCP_Ratio
## 1 DENMARK        3 0.375   2   1     0.333
## 2 USA            2 0.250   1   1     0.500
## 3 CHINA          1 0.125   1   0     0.000
## 4 NORWAY         1 0.125   1   0     0.000
## 5 SPAIN          1 0.125   0   1     1.000
## 
## 
## SCP: Single Country Publications
## 
## MCP: Multiple Country Publications
## 
## 
## Total Citations per Country
## 
##   Country      Total Citations Average Article Citations
## 1      DENMARK             189                        63
## 2      USA                 148                        74
## 3      SPAIN               102                       102
## 4      NORWAY                8                         8
## 5      CHINA                 0                         0
## 
## 
## Most Relevant Sources
## 
##                                                              Sources        Articles
## 1 CRITICAL REVIEWS IN TOXICOLOGY                                                   1
## 2 ENVIRONMENT INTERNATIONAL                                                        1
## 3 ENVIRONMENTAL RESEARCH                                                           1
## 4 ENVIRONMENTAL SCIENCE AND TECHNOLOGY                                             1
## 5 HUMAN REPRODUCTION UPDATE                                                        1
## 6 INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH                1
## 7 JOURNAL OF TOXICOLOGY AND ENVIRONMENTAL HEALTH - PART B: CRITICAL REVIEWS        1
## 8 REPRODUCTIVE TOXICOLOGY                                                          1
## 
## 
## Most Relevant Keywords
## 
##           Author Keywords (DE)      Articles       Keywords-Plus (ID)     Articles
## 1  PERFLUOROOCTANE SULFONATE (PFOS)        2 MALE                               11
## 2  SEMEN QUALITY                           2 ENVIRONMENTAL EXPOSURE             10
## 3  ADVERSE BIRTH OUTCOME                   1 FEMALE                             10
## 4  ADVERSE PREGNANCY OUTCOME               1 PERFLUOROOCTANESULFONIC ACID       10
## 5  CHILDRENS HEALTH                        1 PERFLUOROOCTANOIC ACID             10
## 6  CRYPTORCHIDISM                          1 HUMAN                               7
## 7  DEVELOPMENTAL TOXICITY                  1 PREGNANCY                           7
## 8  ENDOCRINE DISRUPTION                    1 HUMANS                              6
## 9  EPIDEMIOLOGY                            1 POLLUTANT                           6
## 10 FECUNDABILITY                           1 ADOLESCENT                          5
plot(x = results1, k = 10, pause = F) #produces a sequence of standard summary plots displayed in the plots window

#for individual plots assign the above to an object (p <- ) and then call individual plots (1-5, e.g. p[5])
# Publication journals
knitr::kable(table(bib_sco$JI), caption = "Table SX. counts of numbers of publications per journal")
Table SX. counts of numbers of publications per journal
Var1 Freq
CRIT. REV. TOXICOL. 1
ENVIRON. INT. 1
ENVIRON. RES. 1
ENVIRON. SCI. TECHNOL. 1
HUM. REPROD. UPDATE 1
INT. J. ENVIRON. RES. PUBLIC HEALTH 1
J. TOXICOL. ENVIRON. HEALTH PART B CRIT. REV. 1
REPROD. TOXICOL. 1

Making keyword co-occurrence matrix

Thematic map based on keywords

#pdf(file="./plots/figure_thematic_map.pdf", width = 5, height = 5, pointsize = 8, family = "Helvetica", fonts = c("Helvetica"))
par(mfrow=c(1,1), mar=c(0,2,0,2))
map_thematic <- thematicMap(bib_sco, field = "ID", n = 1000, minfreq = 5, stemming = FALSE, size = 0.5, n.labels = 1, repel = TRUE)
plot(map_thematic$map)

#dev.off()

Author collaboration network

Links between review authors based on their co-authorship on these reviews. Currently, there are too few articles included for the connections between the clusters (publications) to be visible, although it looks like one set of authors co-authored two of the eight reviews from the pilot set.

NetMatrix_authors <- biblioNetwork(bib_sco, analysis = "collaboration",  network = "authors", sep = ";")
NetMatrix_authors_plot <- networkPlot(NetMatrix_authors,  n = 50, 
                                      Title = "Author collaboration", 
                                      type = "auto", size = 2, size.cex = TRUE, 
                                      edgesize = 10, labelsize = 1.1) #note there are potentially some mistakes in authors initials

Country collaboration network

Links between countries, based on the institutional affiliations of the review authors.

#pdf(file="./plots/figure_country_collaboration_network.pdf",width = 4, height = 4, pointsize = 12, family = "Helvetica", fonts = c("Helvetica"))
#par(mfrow=c(1,1), mar=c(2,3,2,3))

bib_sco2 <- metaTagExtraction(bib_sco, Field = "AU_CO", sep = ";") #we need to extract countries from the affiliations first
#bib_sco2$AU_CO[1:10]
NetMatrix_country <- biblioNetwork(bib_sco2, analysis = "collaboration", network = "countries", sep = ";")
NetMatrix_country_plot <-  networkPlot(NetMatrix_country, n = 50, 
                                       Title = "Country collaboration", type = "auto", size=TRUE, 
                                       remove.multiple=FALSE, labelsize=1.5)

#dev.off()

Analyzing lists of references (cited works)

# analyzing lists of references 
CR <- citations(bib_sco, field = "article", sep = ";") #list of most cited articles
 cbind(CR$Cited[1:10]) #ten most cited articles
##                                                                                                                                                                                                                                                                                                        [,1]
## AN UPDATED ANALYSIS FROM THE DANISH NATIONAL BIRTH COHORT (2018) INT. J. ENVIRON. RES. PUBL. HEALTH, 15                                                                                                                                                                                                   2
## MENG, Q., INOUE, K., RITZ, B., OLSEN, J., LIEW, Z., PRENATAL EXPOSURE TO PERFLUOROALKYL SUBSTANCES AND BIRTH OUTCOMES                                                                                                                                                                                     2
## VESTED, A., RAMLAU-HANSEN, C.H., OLSEN, S.F., BONDE, J.P., KRISTENSEN, S.L., HALLDORSSON, T.I., BECHER, G., TOFT, G., ASSOCIATIONS OF IN UTERO EXPOSURE TO PERFLUORINATED ALKYL ACIDS WITH HUMAN SEMEN QUALITY AND REPRODUCTIVE HORMONES IN ADULT MEN (2013) ENVIRON HEALTH PERSPECT, 121, PP. 453-458    2
## (2002) GLOBAL ASSESSMENT OF THE STATE-OF-THE-SCIENCE OF ENDOCRINE DISRUPTERS, PP. 1-133. , DAMSTRA T, BARLOW S, BERGMAN A, KAVLOCK R, VAN DER KRAAK G (EDS). GENEVA: WORLD HEALTH ORGANISATION                                                                                                            1
## (2011) ESTIMATION PROGRAMS INTERFACE SUITE FOR MICROSOFT WINDOWS, , USEPA UNITED STATES ENVIRONMENTAL PROTECTION AGENCY, WASHINGTON, DC                                                                                                                                                                   1
## (2012) STATE OF THE SCIENCE OF ENDOCRINE DISRUPTING CHEMICALS, PP. 1-296. , BERGMAN , HEINDEL JJ, JOBLING S, KIDD KA, ZOELLER RT (EDS). GENEVA: WHO-UNEP                                                                                                                                                  1
## (2013) GLOBAL MERCURY ASSESSMENT 2013: SOURCES, EMISSIONS, RELEASES AND ENVIRONMENTAL TRANSPORT, , UNEP. UNITED NATIONS ENVIRONMENT PROGRAMME CHEMICALS BRANCH: GENEVA, SWITZERLAND                                                                                                                       1
## (2013) STATE OF THE SCIENCE OF ENDOCRINE DISRUPTING CHEMICALS - 2012, , SWITZERLAND: UNITED NATIONS ENVIRONMENT PROGRAMME (UNEP) AND THE WORLD HEALTH ORGANIZATION (WHO):. GENEVA                                                                                                                         1
## (2013) STOCKHOLM CONVENTION ON PERSISTENT ORGANIC POLLUTANTS, , UNEP. UNITED NATIONS ENVIRONMENT PROGRAMME: GENEVA, SWITZERLAND                                                                                                                                                                           1
## (2017) SOME CHEMICALS USED AS SOLVENTS AND IN POLYMER MANUFACTURE, , LYON, FRANCE: INTERNATIONAL AGENCY FOR RESEARCH ON CANCER (IARC) WORLD HEALTH ORGANIZATION (WHO), IARC MONOGRAPHS ON THE EVALUATION OF CARCINOGENIC RISKS TO HUMANS VOLUME 110                                                       1
CR <- citations(bib_sco, field = "author", sep = ";")
cbind(CR$Cited[1:10]) #ten most cited authors
##                    [,1]
## TOFT G               20
## BONDE J P            19
## CALAFAT A M          17
## LINDH C H            17
## LOPEZ ESPINOSA M J   15
## OLSEN J              15
## SKAKKEBAEK N E       15
## FLETCHER T           14
## HAUG L S             14
## OLSEN G W            14

Article coupling network: Citation network

# classical article coupling network: Citation network
NetMatrix_coupling <- biblioNetwork(bib_sco, analysis = "coupling", network = "references", sep = "; ")
NetMatrix_coupling_plot <- networkPlot(NetMatrix_coupling, n = 20, 
                                       Title = "Paper co-citation", type = "fruchterman", size=T, 
                                       remove.multiple=FALSE, labelsize=0.8,edgesize = 5) #plot the network

Historical co-citation network

Currently not enough papers to create this network plot.

histResults <- histNetwork(bib_sco, sep = ";") 
## 
## SCOPUS DB: Searching local citations (LCS) by document titles (TI) and DOIs...
## 
## Found 2 documents with no empty Local Citations (LCS)
# not possible now - Found 2 documents with no empty Local Citations (LCS)

# pdf(file="./plots/Figure_hstorical_network.pdf", width=8, height=8, pointsize=10)
# par(mfrow=c(1,1), mar=c(0,0,0,0))
#histPlot(histResults, size=TRUE, arrowsize = 0.2) 
# dev.off()

Cross-objective insights

We will investigate how review type, indicators of review quality or transparency are related other review properties, such as publication date, main review topic and subject, etc. Below, we provide a few examples where data belonging to different objectives is use together to answer these additional questions.

# Review_type_claimed vs. Human_animal_environment contingency table
#table(mdata$Review_type_claimed) #needs cleaning
mdata$Review_type_claimed <- tolower(mdata$Review_type_claimed) 

mdata %>% count(Human_animal_environment, Review_type_claimed) -> t1

# barplot of claimed review type vs. subject

ggplot(t1, aes(x = Review_type_claimed, y = n)) +
 labs(title = expression("Review type and subject")) + #~bold(A.)~' Type and subject'
 coord_flip()  +
 scale_fill_brewer() +
 geom_col(aes(fill = Human_animal_environment), width = 0.7) + 
 theme_light() +
  theme(legend.position = "bottom", legend.box.background = element_rect(colour = "black"), 
        legend.title = element_blank(),   legend.text=element_text(size = 4), 
        axis.title.x = element_text(size = 10), axis.title.y = element_blank())

#ggsave(here("plots","figure_barplot_type_subject.pdf"), width = 4, height = 4, units = "cm", scale = 2, device = cairo_pdf)
## Code for other simple plots with two crossed variables (not run for pilot data)

# Barplot example Review_type_claimed vs. Human_animal_environment contingency table
mdata %>% count(Human_animal_environment, Review_type_claimed) -> t1

ggplot(t1, aes(x = Human_animal_environment, y = n)) +
 coord_flip()  +
 theme(legend.position = "top") +
 geom_col(aes(fill = Review_type_claimed), width = 0.7)

ggplot(t1, aes(x = Review_type_claimed, y = n)) +
 coord_flip()  +
 theme(legend.position = "top") +
 geom_col(aes(fill = Human_animal_environment), width = 0.7)

# Heatmap example (it will look better with more data points)
t2 <- xtabs(~Review_type_claimed+Human_animal_environment, data = mdata)  #contingency table
#heatmap(t2, Colv = NA, Rowv = NA, scale="column", col=cm.colors(256), margins = c(15, 25)) #heatmap
heatmap(t2, Colv = NA, Rowv = NA, scale="column", col=heat.colors(12, rev=TRUE, alpha = 0.6), margins = c(15, 25)) #heatmap

#different vesion with side colour bars
#rc <- rainbow(nrow(t2), start = 0, end = .3)
#cc <- rainbow(ncol(t2), start = 0, end = .3)
#heatmap(t2, Colv = NA, Rowv = NA, scale="column", RowSideColors = rc, ColSideColors = cc, margins = c(15, 25)) #heatmap