====== UPORTO DATA ARCHIVE (UDA) ======
Repository of the Perdigão datasets, served by the [[http://www.unidata.ucar.edu/software/thredds/current/tds/|THREDDS Data Server (TDS)]]
* **most recent info at the [[https://docs.google.com/document/d/1qBJk-eVFiO_A1f6n3IEd99q7GxZJtlKHM5LAOZU4-Do|docs.google.com/document]]**
* [2020.05.23] TDS password removed
* [2020.05.23] cron jobs removed and the mirroring process stopped {{closing_history.tgz|DTU account closing status}}
===== - Using the UDA =====
The UPORTO Data Archive (UDA) may be [[https://windsptds.fe.up.pt/thredds/catalog_perdigao.html|accessed through the THREDDS Data Server (TDS)]] with the same credentials as the UCAR FTP site (perdigao / Bxxxxx!).
WindsP App users may [[https://windsp.fe.up.pt/experiments/3/datasets/~2Fthredds~2Fcatalog_perdigao.xml|explore the UPORTO Data Archive (UDA)]], but during the 12-month embargo period they must provide the TDS credentials whenever they request data or metadata held in the Data Archive.
{{uporto_data_archive_20171120_.pdf|}} | {{ :uda:uporto_data_archive_2020-05-23.pdf |}}
===== - UDA contents =====
^ Export ^ Size ^ Last 24 hours ^
| DLR | 1.2 TiB | 2.32553 GiB |
| DTU | 2.0 TiB | 0.000537872 GiB |
| INEGI | 301 MiB | 0 GiB |
| UCAR | 1.6 TiB | 6.24173 GiB |
| WINDFORS | 3.2 GiB | 0 GiB |
Summary on 20-03-2018 00:00.
For more info see the [[http://winds.fe.up.pt/datalogs/?C=N;O=D|datalogs details]].
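As a quick sanity check on the table above, the per-export sizes can be added up in a common unit. The sketch below hard-codes the values from the 20-03-2018 summary and assumes ''awk'' is available; ''to_gib'' and ''total_gib'' are hypothetical helper names used only for this illustration.

```shell
#!/bin/sh
# Convert "<value> <unit>" (MiB, GiB or TiB) to GiB. Hypothetical helper.
to_gib() {
    awk -v v="$1" -v u="$2" 'BEGIN {
        if (u == "TiB") f = 1024
        else if (u == "GiB") f = 1
        else if (u == "MiB") f = 1 / 1024
        else exit 1
        printf "%.5f\n", v * f
    }'
}

# Sizes copied from the table (summary of 20-03-2018).
total_gib() {
    {
        to_gib 1.2 TiB   # DLR
        to_gib 2.0 TiB   # DTU
        to_gib 301 MiB   # INEGI
        to_gib 1.6 TiB   # UCAR
        to_gib 3.2 GiB   # WINDFORS
    } | awk '{ s += $1 } END { printf "%.2f TiB\n", s / 1024 }'
}

total_gib
```

With the rounded values in the table this gives roughly 4.8 TiB for the whole archive.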
perdigao@windsptds:/data$ tree -d -L 3 /data/perdigao
data/perdigao
├── dlr
│   ├── HATPRO_level-1
│   │   ├── 201704
│   │   ├── 201704_quicklook
│   │   ...
│   ├── HATPRO_level-2
│   │   ├── 201704
│   │   ├── 201704_quicklook
│   │   ...
│   ├── HATPRO_surface-met
│   │   ├── 201704
│   │   ├── 201704_quicklook
│   │   ...
│   ├── mcs_data
│   │   ├── 20170430095732
│   │   ...
│   ├── netcdf_lidar
│   │   ├── DLR85
│   │   ├── DLR86
│   │   ├── DLR89
│   │   └── readme.tx
│   ├── raw_data
│   │   ├── DLR85
│   │   ├── DLR86
│   │   └── DLR89
│   └── sound
│       ├── mic1
│       ├── mic2
│       ├── mic3
│       ├── mic4
│       ├── mic5
│       └── microphone_position.txt
├── dtu
│   ├── data
│   │   ├── DTU_Leica_Scanning
│   │   ├── DTU_Mast_Data
│   │   └── DTU_WindScanner
│   ├── docs
│   ├── landscape
│   ├── photos
│   └── plots
│       └── DTU_WindScanner
├── inegi
│   ├── EnerconWindTurbine
│   ├── LeosphereWindcube
│   │   └── 01_RawData
│   └── LidarAerialSurvey_RawData
│       ├── Images
│       ├── PerdigaoTurbineTopView.pdf
│       ├── PointCloud
│       └── Portugal Laserscanning Report.pdf
├── ucar
│   ├── arl
│   │   ├── ARL_Scanning_Lidar_George_Site
│   │   ├── ARL_Scanning_Lidar_Lionstail_Site
│   │   └── ARL_Scintillometer
│   ├── colorado
│   │   └── CU_Lidar
│   ├── eol
│   │   └── WV-DIAL
│   ├── isfs
│   │   ├── hr_noqc_geo
│   │   └── noqc_geo_notiltcor
│   ├── ncas
│   │   └── NCAS_profiler
│   ├── notredame
│   │   ├── UND_Ceilometer
│   │   ├── UND_Radiosonde
│   │   ├── UND_Scanning_Lidar_Lionshead_Site
│   │   ├── UND_Scanning_Lidar_MI6_Site
│   │   ├── UND_Scanning_Lidar_Orange_Site
│   │   └── UND_SODAR_RASS
│   └── oklahoma
│       ├── CLAMPS_AERI
│       ├── CLAMPS_MWR
│       └── CLAMPS_Scanning_Lidar
└── windfors
    ├── 2017
    │   ├── 201704
    │   ├── 201705
    │   └── 201706
    └── cross
        └── 2017
441 directories, 8 files
===== - Building the UDA =====
Each institution collecting data in the Perdigão experiment has its own credentials to upload and maintain its data in its own UDA catalogue, through an rsync export.
Available exports:
nejoco@VIND-pNEWA04:~> rsync -rdt rsync://windsptds.fe.up.pt
test RSYNC test
archive RSYNC UDA FILES (read only)
ucar RSYNC UCAR FILES
dtu RSYNC DTU FILES
inegi RSYNC INEGI FILES
dlr RSYNC DLR FILES
windfors RSYNC WindForS FILES
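An upload to one of these exports is usually worth checking with a dry run first. The sketch below only builds and prints the rsync command for a given export; ''/path/to/local/data'' is a placeholder, ''uda_upload_cmd'' is a hypothetical helper, and it assumes the export name doubles as the rsync user name, as in the exports used elsewhere on this page.

```shell
#!/bin/sh
UDA_HOST=windsptds.fe.up.pt

# Print the rsync command that would upload directory $2 to export $1.
# Pass "dry" as $3 to include --dry-run (nothing is transferred).
uda_upload_cmd() {
    export_name=$1
    # Ensure a trailing slash so the directory contents are copied,
    # not the directory itself.
    src=${2%/}/
    flags="-avz"
    if [ "${3:-}" = "dry" ]; then
        flags="$flags --dry-run"
    fi
    printf 'rsync %s %s %s@%s::%s/\n' \
        "$flags" "$src" "$export_name" "$UDA_HOST" "$export_name"
}

uda_upload_cmd dlr /path/to/local/data dry
```

The printed command can be reviewed and then run by hand; rsync asks for the export password unless ''RSYNC_PASSWORD'' is set.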
===== - Upload DTU data =====
UPORTO (as ''nejoco@login.neweuropeanwindatlas.eu'') uses the UDA export ''dtu@windsptds.fe.up.pt::dtu'' to sync data collected by DTU.
Initially a complete mirror was kept: a cron job synced the DTU data directory automatically every 4 hours with
''/usr/bin/rsync -az --delete /newa/WP2/PERDIGAO/ dtu@windsptds.fe.up.pt::dtu''.
Later the ''--delete'' option was removed and some directories were excluded, to obtain the Perdigão Data Archive at UDA.
# DTU data sync to UDA, At minute 31 past every 4th hour
#31 */4 * * * /usr/bin/rsync -az --delete /newa/WP2/PERDIGAO/ dtu@windsptds.fe.up.pt::dtu > /dev/null 2>&1
31 */4 * * * /usr/bin/rsync -az --exclude-from 'sync-exclude-list' /newa/WP2/PERDIGAO/ dtu@windsptds.fe.up.pt::dtu > /dev/null 2>&1
cat ~nejoco/sync-exclude-list
archive/
data/DLR_WindScanner/
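The effect of the two exclude patterns can be tried locally before touching the real export. The sketch below builds a throw-away tree with made-up file names (''a'', ''b'', ''c''), syncs it between two temporary directories using the same exclude list, and prints what arrived; it assumes ''rsync'' and ''mktemp'' are installed, and ''demo_exclude'' is a hypothetical name.

```shell
#!/bin/sh
# Sync a temporary source tree with the exclude list from this page
# and list the files that reached the destination.
demo_exclude() {
    tmp=$(mktemp -d) || return 1
    mkdir -p "$tmp/src/archive" "$tmp/src/data/DLR_WindScanner" \
             "$tmp/src/data/DTU_Mast_Data" "$tmp/dst"
    touch "$tmp/src/archive/a" "$tmp/src/data/DLR_WindScanner/b" \
          "$tmp/src/data/DTU_Mast_Data/c"
    # Same two patterns as ~nejoco/sync-exclude-list.
    printf '%s\n' 'archive/' 'data/DLR_WindScanner/' > "$tmp/sync-exclude-list"

    rsync -a --exclude-from "$tmp/sync-exclude-list" "$tmp/src/" "$tmp/dst/"

    (cd "$tmp/dst" && find . -type f | sort)
    rm -rf "$tmp"
}

# Only run the demonstration when rsync is available.
if command -v rsync >/dev/null 2>&1; then
    demo_exclude
fi
```

Only ''./data/DTU_Mast_Data/c'' should be listed: the trailing slash in each pattern restricts it to directories, and everything below an excluded directory is skipped.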
===== - Upload UCAR data =====
UCAR uses the UDA export ''ucar@windsptds.fe.up.pt::ucar'' to copy NCAR/EOL ISFS data.
===== - Upload DLR data =====
DLR uses the UDA export ''dlr@windsptds.fe.up.pt::dlr'' to maintain the DLR data.
===== - Upload INEGI data =====
INEGI uses the UDA export ''inegi@windsptds.fe.up.pt::inegi'' to maintain the ENERCON data and "Lidar Aerial Survey Data".
===== - Upload WindForS data =====
WindForS uses the UDA export ''windfors@windsptds.fe.up.pt::windfors'' to maintain the WindForS data.
===== - Mirror UCAR ftp site =====
Preliminary data on the FTP site (ARL, Notre Dame, ...) were mirrored to the UDA with:
#! /bin/sh
dir=arl
source=ftp://ftp.eol.ucar.edu/pub/data/incoming/perdigao/uda/$dir
destination=/data/perdigao/ucar
nohup wget -m -nH --cut-dirs=5 -P $destination $source >| /dev/null 2>&1 &
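The options ''-nH --cut-dirs=5'' decide where each mirrored file lands under ''$destination''. The sketch below imitates that path mapping in plain shell so it can be checked without contacting the FTP server; ''cut_dirs'' is a hypothetical helper and ''scan_001.nc'' is a made-up file name.

```shell
#!/bin/sh
# Imitate how "wget -nH --cut-dirs=N" maps a remote path to a local one:
# the host name is dropped (-nH) and the first N directory components
# of the remote path are removed (--cut-dirs=N).
cut_dirs() {
    path=${1#/}
    n=$2
    while [ "$n" -gt 0 ]; do
        case $path in
            */*) path=${path#*/} ;;   # drop one leading directory
            *)   break ;;             # only the file name is left
        esac
        n=$((n - 1))
    done
    printf '%s\n' "$path"
}

# Prints arl/scan_001.nc, i.e. the file is stored below $destination/arl/.
cut_dirs /pub/data/incoming/perdigao/uda/arl/scan_001.nc 5
```

The five removed components are ''pub/data/incoming/perdigao/uda'', so the ''arl'' directory ends up directly under ''/data/perdigao/ucar''.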
Afterwards the mirror is verified by running the following on the FTP site:
#! /bin/sh
dir=arl
cd /incoming/perdigao/uda
export RSYNC_PASSWORD=t****YLa****
archive=ucar@windsptds.fe.up.pt::ucar/$dir
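# trailing slashes make rsync compare the contents of $dir with $archive instead of nesting $dir inside it
rsync -avz --delete --dry-run "$dir/" "$archive/"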
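Reading a long dry-run listing by eye is error-prone, so the check can be reduced to a count. The sketch below is an assumption-laden variant, not the procedure used here: it expects the dry run to be made with ''-i'' (''--itemize-changes'') instead of ''-v'', and ''pending_changes'' is a hypothetical helper that reads that output on standard input.

```shell
#!/bin/sh
# Count pending changes in the output of
#   rsync -ai --delete --dry-run "$dir/" "$archive/"
# read from stdin. Itemized lines whose first character is '<', '>',
# 'c' or 'h' are transfers, creations or hard links; '*deleting' marks
# a file that --delete would remove. Attribute-only updates (leading
# '.') are deliberately ignored in this sketch.
pending_changes() {
    # grep -c exits with status 1 when the count is 0; "|| true" keeps
    # the function usable under "set -e".
    grep -c -E '^([<>ch][fdLDS]|\*deleting)' || true
}

# Usage (not executed here):
#   rsync -ai --delete --dry-run "$dir/" "$archive/" | pending_changes
```

A result of ''0'' means the dry run found nothing to transfer or delete.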
===== - Mirror UDA to DTU =====
The UPORTO Data Archive (UDA) is synced automatically to DTU every 24 hours by a cron job, using the UDA read-only export ''uda@windsptds.fe.up.pt::archive/''.
# UDA archive to DTU, At midnight every day
0 0 * * * /home/nejoco/sync-uda.sh >| sync-uda_last.log 2>&1
#! /bin/sh
# the Perdigao root at NEWA storage
perdigao=/newa/WP2/PERDIGAO
# the archive root
archive=$perdigao/archive
# the actual size of the archive
echo "Total du of $archive:"
du -ks $archive
# the UDA readonly password
export RSYNC_PASSWORD=-password-
# catalogues to sync
CATALOGS="dlr inegi ucar windfors"
for c in $CATALOGS; do
# mirror catalog from the version at UDA (UPORTO)
echo; echo "$(printf '%s' "$c" | tr '[:lower:]' '[:upper:]'):"
#cmd="rsync -avz uda@windsptds.fe.up.pt::archive/$c/ $archive/$c/"
cmd="rsync -avz --delete uda@windsptds.fe.up.pt::archive/$c/ $archive/$c/"
echo "$cmd..."
# do it
$cmd
done
# catalog structure
echo
tree -L 2 $archive
# total space usage for each archive
echo
du -khs $archive/*
# the final size of the archive
echo
echo "Total du of $archive:"
du -ks $archive
# the end
echo
echo "Done."
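The script above runs each rsync without looking at its exit status. A possible variation, shown only as a sketch, records which catalogues failed and can print the commands instead of running them; ''run'', ''mirror_catalogs'' and the ''DRY'' variable are hypothetical names, not part of the script in use.

```shell
#!/bin/sh
# Run a command, or only print it when DRY=1.
run() {
    if [ "${DRY:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Mirror each catalogue given after the archive root and report failures.
mirror_catalogs() {
    archive=$1
    shift
    failed=""
    for c in "$@"; do
        printf '\n%s:\n' "$(printf '%s' "$c" | tr '[:lower:]' '[:upper:]')"
        if ! run rsync -avz --delete \
                "uda@windsptds.fe.up.pt::archive/$c/" "$archive/$c/"; then
            failed="$failed $c"
        fi
    done
    if [ -n "$failed" ]; then
        echo "Failed catalogues:$failed" >&2
        return 1
    fi
    return 0
}

# Print the commands only; set DRY=0 to perform the real transfer.
DRY=1
mirror_catalogs /newa/WP2/PERDIGAO/archive dlr inegi ucar windfors
```

A non-zero return value lets cron mail or log the failure instead of silently leaving a stale copy.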
The DTU NEWA directory ''/newa/WP2/PERDIGAO/archive/'' contains an exact copy of UDA, except for the DTU data, which are symbolic links to existing NEWA directories (to avoid duplicating 2.0 TiB of storage).
/newa/WP2/PERDIGAO/archive
├── dlr
│   ├── HATPRO_level-1
│   ├── HATPRO_level-2
│   ├── HATPRO_surface-met
│   ├── mcs_data
│   ├── netcdf_lidar
│   └── raw_data
├── dtu
│   ├── DTU_Leica_Scanning -> /newa/WP2/PERDIGAO/data/DTU_Leica_Scanning
│   ├── DTU_Mast_Data -> /newa/WP2/PERDIGAO/data/DTU_Mast_Data
│   └── DTU_WindScanner -> /newa/WP2/PERDIGAO/data/DTU_WindScanner
├── inegi
│   ├── EnerconWindTurbine
│   ├── LeosphereWindcube
│   └── LidarAerialSurvey_RawData
├── ucar
│   ├── isfs
│   ├── iss
│   └── ncas
└── windfors
    ├── 2017
    └── cross
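The ''dtu'' entries in the listing above are symbolic links rather than synced copies. A minimal sketch of how such links could be created is given below; ''link_dtu_data'' is a hypothetical helper, and ''ln -n'' is assumed to be supported (it is in GNU and BSD ''ln'', but it is not required by POSIX).

```shell
#!/bin/sh
# Create <root>/archive/dtu/<name> -> <root>/data/<name> for the three
# DTU data directories shown in the listing.
link_dtu_data() {
    perdigao=$1
    mkdir -p "$perdigao/archive/dtu" || return 1
    for d in DTU_Leica_Scanning DTU_Mast_Data DTU_WindScanner; do
        # -s: symbolic link; -f: replace an existing link;
        # -n: do not follow an existing link to a directory.
        ln -sfn "$perdigao/data/$d" "$perdigao/archive/dtu/$d" || return 1
    done
}

# Usage (not executed here):
#   link_dtu_data /newa/WP2/PERDIGAO
```

Because the DTU data are linked in place, ''dtu'' does not need to appear among the catalogues pulled back from UDA.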
===== - Current status =====
A collaborative version of this table is kept up to date in Google Docs.
{{ :uda:20171120_perdigao_archiving.pdf |snap at 22/12/2017}} |
{{ :uda:20180320_perdigao-archives.pdf |snap at 20/03/2018}} | {{ :uda:2020-05-23_perdigao-archives.pdf |snap at 23/05/2020}}
--- //[[jlopes@fe.up.pt|Correia Lopes]] 2017/11/17 11:10//