NZ 2017 - Step 06
Create Phyloseq and Fasta files

1 Steps
2 Directory structure
3 Files used
4 Processing

Load the variables common to the different scripts and the necessary libraries

source("00-NZ_2017_init.R", echo = FALSE)

1 Steps

Read the mothur database file (Excel format).
Create the phyloseq_mothur and fasta files.
Reassign the OTUs using dada2.
Create a new phyloseq_dada2 file based on the dada2 assignment.

2 Directory structure

/blast : BLAST output
/mothur : mothur output

3 Files used

NZ_2017_mothur.xlsx : contains the mothur database file with OTUs at 98% similarity
phyloseq_nz_2017_mothur.rds : phyloseq file based on the mothur taxonomic assignment
nz_2017_otu_mothur_taxo.fas : OTU fasta file with the mothur taxonomy on the definition line
nz_2017_otu_mothur.fas : OTU fasta file without taxonomy
phyloseq_nz_2017_dada2.rds : phyloseq file based on the dada2 taxonomic assignment

4 Processing

4.1 To do

Note for later : make a function phyloseq_import_mothur(phyloseq_file, excel_file, otu_sheet, sample_sheet, sample_reg_ex, taxo_reg_ex).
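One possible shape for that function is sketched below. Only the function name and its first six arguments come from the note above; the two extra arguments (otu_id_col, sample_id_col) and the whole body are assumptions made for illustration, based on the steps in sections 4.2 to 4.4. The readxl and phyloseq calls are namespace-qualified, so defining the function does not require the packages to be attached.

```r
# Hypothetical sketch of the planned helper, not an existing function.
# Assumed extra arguments: otu_id_col and sample_id_col (names taken from
# the columns used later in this script).
phyloseq_import_mothur <- function(phyloseq_file, excel_file, otu_sheet, sample_sheet,
                                   sample_reg_ex, taxo_reg_ex,
                                   otu_id_col = "OTUNumber",
                                   sample_id_col = "sample_code_mothur") {
  # Sample metadata, with row names set to the sample codes
  samples_df <- as.data.frame(readxl::read_excel(excel_file, sheet = sample_sheet))
  row.names(samples_df) <- samples_df[[sample_id_col]]

  # OTU database: abundance and taxonomy columns are picked by regular expression
  otu_database <- as.data.frame(readxl::read_excel(excel_file, sheet = otu_sheet))
  otu_ids <- otu_database[[otu_id_col]]
  is_sample_col <- grepl(sample_reg_ex, names(otu_database))
  is_taxo_col <- grepl(taxo_reg_ex, names(otu_database))

  otu_mat <- as.matrix(otu_database[, is_sample_col, drop = FALSE])
  tax_mat <- as.matrix(otu_database[, is_taxo_col, drop = FALSE])
  rownames(otu_mat) <- otu_ids
  rownames(tax_mat) <- otu_ids

  # Assemble the phyloseq object and save it as an RDS file
  ps <- phyloseq::phyloseq(
    phyloseq::otu_table(otu_mat, taxa_are_rows = TRUE),
    phyloseq::tax_table(tax_mat),
    phyloseq::sample_data(samples_df)
  )
  saveRDS(ps, file = phyloseq_file)
  invisible(ps)
}
```

Note that this sketch uses base grepl(), which is case-sensitive, whereas the matches() helper used in section 4.3 ignores case by default.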

4.2 Read the Mothur Excel file

mothur_file_xls <- paste0(path_files, "NZ_2017_mothur.xlsx")
samples_metadata <- read_excel(mothur_file_xls, sheet = "samples")
otu_database <- read_excel(mothur_file_xls, sheet = "otus_0.98")

4.3 Input data from XLS file

# 1. samples table : row names are the mothur sample codes
#    (setting row names on a tibble is deprecated, so convert to a data frame first)
samples_df <- as.data.frame(samples_metadata)
row.names(samples_df) <- samples_df$sample_code_mothur

# 2. otu table : keep the OTU column and the abundance columns, selected with a
# regular expression on the sample name prefixes
otu <- otu_database %>% select(OTUNumber, matches("^(BT|CS|CTD|MS|N|UW|STR).")) %>% 
    as.data.frame() %>% tibble::column_to_rownames("OTUNumber")

# 3. Taxonomy table
tax <- otu_database %>% select(OTUNumber, starts_with("Taxo")) %>% 
    as.data.frame() %>% tibble::column_to_rownames("OTUNumber")
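Before relying on the regular expression, it can help to see which column names it actually catches. The sketch below uses base grepl() on a made-up vector of column names (the names are hypothetical, not taken from the Excel file). Because matches() ignores case by default, ignore.case = TRUE is set here to mimic it.

```r
# Regular expression used to select the abundance columns
pattern <- "^(BT|CS|CTD|MS|N|UW|STR)."

# Hypothetical column names, for illustration only
example_columns <- c("OTUNumber", "BT01", "CTD12", "N3", "STR_05",
                     "Taxo1", "repSeq", "notes")

# matches() is case-insensitive by default, so mimic that here
matched <- grepl(pattern, example_columns, ignore.case = TRUE)
selected <- example_columns[matched]
print(selected)
```

With these names, the short "N" alternative also catches the "notes" column, so any non-sample column starting with one of the prefixes would end up in the OTU table.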

4.4 Create and save to phyloseq object

# Transform into matrices
# ---------------------------------------------------
otu_mat <- as.matrix(otu)
tax_mat <- as.matrix(tax)

# Transform to phyloseq object and save to an RDS file
# --------------------------
OTU <- otu_table(otu_mat, taxa_are_rows = TRUE)
TAX <- tax_table(tax_mat)
samples <- sample_data(samples_df)

phyloseq_mothur <- phyloseq(OTU, TAX, samples)
saveRDS(phyloseq_mothur, file = "phyloseq_nz_2017_mothur.rds")  # Can be loaded with readRDS(file = 'phyloseq_nz_2017_mothur.rds')
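The three components are joined on their names: OTU names for the abundance and taxonomy matrices, sample names for the abundance matrix and the metadata. As far as I understand, phyloseq() keeps only the names shared by all components, so a mismatch can drop data silently. The helper below is a hypothetical pre-check in base R, shown on toy data, that lists names present in one component but not the other.

```r
# Hypothetical helper: report names that would not line up between components
check_phyloseq_inputs <- function(otu_mat, tax_mat, samples_df) {
  list(
    otus_without_taxonomy    = setdiff(rownames(otu_mat), rownames(tax_mat)),
    taxonomy_without_otus    = setdiff(rownames(tax_mat), rownames(otu_mat)),
    samples_without_metadata = setdiff(colnames(otu_mat), rownames(samples_df)),
    metadata_without_samples = setdiff(rownames(samples_df), colnames(otu_mat))
  )
}

# Toy data, for illustration only
toy_otu <- matrix(c(10, 0, 3, 7), nrow = 2,
                  dimnames = list(c("Otu001", "Otu002"), c("BT01", "CS02")))
toy_tax <- matrix(c("Eukaryota", "Eukaryota"), nrow = 2,
                  dimnames = list(c("Otu001", "Otu002"), "Taxo1"))
toy_samples <- data.frame(depth = c(5, 10, 20),
                          row.names = c("BT01", "CS02", "MS03"))

check_result <- check_phyloseq_inputs(toy_otu, toy_tax, toy_samples)
print(check_result)
```

In this toy case the only mismatch is the sample MS03, which has metadata but no abundance column.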

4.5 Save sequences in fasta files with and without mothur taxonomy

Each fasta file is also written in compressed form (for example nz_2017_otu_mothur.fas.gz) in the /mothur directory.

# 4. sequences
otu_sequence <- otu_database %>% select(OTUNumber, matches("^Taxo."), repSeq) %>% 
    transmute(seq_name = OTUNumber, sequence = repSeq, supergroup = Taxo2, division = Taxo3, 
        class = Taxo4, order = Taxo5, family = Taxo6, genus = Taxo7, species = Taxo8)
fasta_write(otu_sequence, str_c(path_mothur, "nz_2017_otu_mothur_taxo.fas"), 
    compress = FALSE)

[1] TRUE

fasta_write(otu_sequence, str_c(path_mothur, "nz_2017_otu_mothur_taxo.fas.gz"), 
    compress = TRUE)

[1] TRUE

fasta_write(otu_sequence, str_c(path_mothur, "nz_2017_otu_mothur.fas"), compress = FALSE, 
    taxo_include = FALSE)

[1] TRUE

fasta_write(otu_sequence, str_c(path_mothur, "nz_2017_otu_mothur.fas.gz"), compress = TRUE, 
    taxo_include = FALSE)

[1] TRUE
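The fasta_write() function is defined outside this script and its code is not shown here. To make the two file variants concrete, the sketch below is a minimal base R stand-in, not the actual implementation. The function name write_fasta_sketch, the "|" separator on the definition line and the toy sequences are all assumptions.

```r
# Hypothetical stand-in for a fasta writer (not the real fasta_write)
write_fasta_sketch <- function(df, file_name, compress = FALSE, taxo_include = TRUE) {
  # Every column other than seq_name and sequence is treated as a taxonomic rank
  taxo_cols <- setdiff(names(df), c("seq_name", "sequence"))
  header <- df$seq_name
  if (taxo_include && length(taxo_cols) > 0) {
    taxo <- do.call(paste, c(df[taxo_cols], sep = "|"))
    header <- paste(header, taxo, sep = "|")
  }
  # Interleave definition lines and sequences
  fasta_lines <- as.vector(rbind(paste0(">", header), df$sequence))
  con <- if (compress) gzfile(file_name, "w") else file(file_name, "w")
  on.exit(close(con))
  writeLines(fasta_lines, con)
  invisible(TRUE)
}

# Toy sequences, for illustration only
toy_seq <- data.frame(seq_name = c("Otu001", "Otu002"),
                      sequence = c("ACGT", "TTGA"),
                      supergroup = c("Archaeplastida", "Stramenopiles"),
                      division = c("Chlorophyta", "Ochrophyta"),
                      stringsAsFactors = FALSE)

with_taxo_file <- tempfile(fileext = ".fas")
no_taxo_file <- tempfile(fileext = ".fas")
write_fasta_sketch(toy_seq, with_taxo_file)
write_fasta_sketch(toy_seq, no_taxo_file, taxo_include = FALSE)

with_taxo_lines <- readLines(with_taxo_file)
no_taxo_lines <- readLines(no_taxo_file)
```

The file without taxonomy carries only the OTU name on each definition line, which is the form passed to the dada2 assignment in the next section.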

4.6 Reassign the sequences using dada2 assigner

See the BLAST analysis file (step 07). The mothur assignment introduces Unassigned entries in the taxonomy, which is a real problem for the next steps that use phyloseq_mothur. The OTUs are therefore reassigned with dada2 and the phyloseq file is written again with the new taxonomy (file "phyloseq_nz_2017_dada2.rds").

tax_list_dada2 <- dada2_assign(seq_file_name = str_c(path_mothur, "nz_2017_otu_mothur.fas"))
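To gauge how much of the mothur taxonomy is affected, the Unassigned entries can be counted per rank on the taxonomy matrix. The sketch below runs on a toy matrix (the taxa are made up); the same two lines could be applied to the tax_mat built in section 4.4.

```r
# Toy taxonomy matrix, for illustration only (filled column by column)
tax_mat_example <- matrix(
  c("Eukaryota", "Eukaryota", "Eukaryota",
    "Archaeplastida", "Unassigned", "Stramenopiles",
    "Chlorophyta", "Unassigned", "Unassigned"),
  nrow = 3,
  dimnames = list(c("Otu001", "Otu002", "Otu003"), c("Taxo1", "Taxo2", "Taxo3"))
)

# Number of Unassigned entries at each rank
unassigned_per_rank <- colSums(tax_mat_example == "Unassigned", na.rm = TRUE)

# OTUs with at least one Unassigned rank
otus_with_unassigned <- rownames(tax_mat_example)[
  rowSums(tax_mat_example == "Unassigned", na.rm = TRUE) > 0
]

print(unassigned_per_rank)
print(otus_with_unassigned)
```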

4.7 Write a new phyloseq file with taxonomy assigned by dada2

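# Read the dada2 taxonomy table
tax <- read_tsv(str_c(path_mothur, "nz_2017_otu_mothur.dada2.taxo"))
# Setting row names on a tibble is deprecated: convert to a data frame keyed by seq_name
tax <- tax %>% as.data.frame() %>% tibble::column_to_rownames("seq_name")
tax_mat <- as.matrix(tax)
TAX <- tax_table(tax_mat)
# Reuse the OTU table and sample data created for phyloseq_mothur
phyloseq_dada2 <- phyloseq(OTU, TAX, samples)
saveRDS(phyloseq_dada2, file = "phyloseq_nz_2017_dada2.rds")  # Can be loaded with readRDS(file = 'phyloseq_nz_2017_dada2.rds')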
