# Training camp I

2017 Spring

Brush up your programming skills!

## Objective

To brush up your programming skills.

## Schedule and place

### Duration

26th-28th May 2017

### Time table

    26th  09:00-09:10  Opening talk (Dr. Naho Orita)
          09:10-17:00  Practical work (12:00-13:00 Lunch break)
    27th  10:00-17:00  Practical work (12:00-13:00 Lunch break)
    28th  10:00-16:00  Practical work (12:00-13:00 Lunch break)
          16:00-17:00  Presentation

### Place

 Class Room 412, GSIS building

### Description

In this class, you are expected to solve the programming problems listed below. They consist of basic problems and optional problems. Solve all of the basic problems, then choose three of the optional problems and solve them.

### Programming language

Use Python or R for the basic problems. For the optional problems, you can use any language. If you use a compiled language, make sure that your program runs correctly in other environments. If your program uses non-standard libraries, bundle them with your code or clearly state how to install them in another environment.

### Code

    #!/usr/bin/env python

    def main():
        # body of the program

    if __name__ == '__main__':
        main()


### Terminal

Terminal input and output are shown in a box like the following. Here, "$" stands for the prompt.

    $ ls


### Text

Text is shown in a box like the following.

    This is a text file.


### Structure of Python code

Write the body of your program in the "main()" function. Use a shebang. As a rule, do not write anything except the call to "main()" beneath "if __name__ == '__main__':".

    #!/usr/bin/env python
    import numpy as np

    def main():
        # body of the program

    if __name__ == '__main__':
        main()


### Structure of R code

Write the body of your program in the "main()" function. Use a shebang.

    #!/usr/bin/env Rscript

    main=function()
    {
        # body of the program
    }

    main()




### Problem 002

Make a program, which calculates $15^9$ and prints the result to the screen.

### Problem 003

Make a program, which calculates the remainder of $100/7$ and prints it to the screen.

### Problem 004

Make a program, which generates variables $a=9$, $b=7$ and $c=4$, calculates $(a+b)^c$ and prints the result to the screen.

### Problem 005

Make a program, which prints "Hello" to the screen.

### Problem 006

Make a program, which generates variables $a$ and $b$ containing "Hello" and "World" respectively, and prints these variables to the screen on the same line, separating the variables with a space.

Execution example
0
1
2
.
.
.
99


### Problem 008

Make a program, which prints numbers from 0 to 10 separated by "," on a line.

Execution example
B


### Problem 010

Make a program, which prints "A-B-O-AB". To do this, use the "join" function.
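As a minimal sketch of how "join" works in Python: the separator string calls `join` on a list of strings and places itself between the elements. The variable names here are illustrative assumptions, not part of the problem statement.

```python
# Illustrative list name; the problem does not prescribe one.
blood_types = ["A", "B", "O", "AB"]

# The separator string is placed between the elements of the list.
joined = "-".join(blood_types)
print(joined)
```

Note that `join` only accepts strings, so numbers would need to be converted with `str` first.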

### Problem 011

Make a program, which generates a list variable $a$ = [10, 20, 40, 80, 30] and prints sorted $a$ to the screen.

Execution example
2


### Problem 013

Make a program, which generates a dictionary variable $a$ = {"January":1, "February":2, "May":5} and prints the keys of $a$ sorted by their values in descending order. For this, use a "for" loop.
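One possible approach, sketched under the assumption that the built-in `sorted` function is allowed: `sorted` iterates over the keys, and `key=a.get` ranks each key by its value. The name `keys_by_value` is an illustrative choice.

```python
a = {"January": 1, "February": 2, "May": 5}

# sorted() walks over the keys; key=a.get ranks each key by its value,
# and reverse=True gives descending order.
keys_by_value = sorted(a, key=a.get, reverse=True)
for key in keys_by_value:
    print(key)
```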

Execution example

    $ bp013.py
    May
    February
    January

### Problem 014

Make a program, which generates a list variable $a$ = [10, "20", 40, "80", 30] and prints only the numerical values among its elements. For this, use "for" and "if" statements.

### - Function

### Problem 015

Implement a function to calculate the sum of a list variable $a$ = [10, 20, 40, 80, 30].

Example code

    #!/usr/bin/env python

    def main():
        a=[10, 20, 40, 80, 30]
        print(mysum(a))

    def mysum(lix):
        # body of the function

    if __name__ == '__main__':
        main()

### Problem 016

Implement a function to sort an input list variable $a$ = [10, 20, 40, 80, 30] in descending order.

Example code

    #!/usr/bin/env python

    def main():
        a=[10, 20, 40, 80, 30]
        print(mysort(a))

    def mysort(lix):
        # body of the function

    if __name__ == '__main__':
        main()

### Problem 017

Implement a function to calculate the mean of a list variable $a$ = [10, 20, 40, 80, 30].

Example code

    #!/usr/bin/env python

    def main():
        a=[10, 20, 40, 80, 30]
        print(mymean(a))

    def mymean(lix):
        # body of the function

    if __name__ == '__main__':
        main()

### Problem 018

Implement a function to calculate the variance of a list variable $a$ = [10, 20, 40, 80, 30].

Example code

    #!/usr/bin/env python

    def main():
        a=[10, 20, 40, 80, 30]
        print(myvariance(a))

    def myvariance(lix):
        # body of the function

    if __name__ == '__main__':
        main()

### Problem 019

Implement a function to calculate the standard deviation (based on the unbiased variance) of a list variable $a$ = [10, 20, 40, 80, 30].

Example code

    #!/usr/bin/env python

    def main():
        a=[10, 20, 40, 80, 30]
        print(mysd(a))

    def mysd(lix):
        # body of the function

    if __name__ == '__main__':
        main()

### Problem 020

Implement a function to calculate the Fibonacci numbers. Print the 10th Fibonacci number to the screen.

Example code

    #!/usr/bin/env python

    def main():
        n=10
        print(myfibo(n))

    def myfibo(n):
        # body of the function

    if __name__ == '__main__':
        main()

### Problem 021

Make a program, which accepts a number ($n$) as input from the keyboard and prints the Fibonacci number corresponding to $n$.

Execution example

    $ bp021.py
    Input a number:

After inputting 13 from keyboard
180


### Problem 023

Make a program, which generates a list variable $a$ = [10, 20, 40, 80, 30] and calculates the mean of all elements in the list using numpy.

### Problem 024

Make a program, which generates a list variable $a$ = [10, 20, 40, 80, 30] and calculates the standard deviation of all elements in the list using numpy.

### Problem 025

Make a program, which generates a list variable $a$ = [10, 20, 40, 80, 30] and finds the maximum value of the list using numpy.

### Problem 026

Make a program, which generates a list variable $a$ = [10, 20, 40, 80, 30] and finds the index of the minimum value (argmin) of the list using numpy.
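The numpy functions relevant to Problems 023 - 026 can be sketched as follows. One detail worth knowing is that `np.std` divides by $n$ by default, whereas passing `ddof=1` divides by $n-1$. The result variable names are illustrative assumptions.

```python
import numpy as np

a = [10, 20, 40, 80, 30]

mean = np.mean(a)
sd_population = np.std(a)        # divides by n (ddof=0 is the default)
sd_sample = np.std(a, ddof=1)    # divides by n - 1
largest = np.max(a)
index_of_smallest = np.argmin(a)  # position of the minimum, counted from 0

print(mean, sd_population, sd_sample, largest, index_of_smallest)
```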

### Problem 027

Make a program, which generates list variables $a$ = [0, 2, 6, 8, 9, 10, 14, 16] and $b$ = [35, 41, 21, 14, 6, 2, 0, 2] and draws a scatter plot with $a$ on the horizontal axis and $b$ on the vertical axis. The "matplotlib" module is useful for this.

Example code

    #!/usr/bin/env python
    import matplotlib
    matplotlib.use('Agg')
    import matplotlib.pyplot as plt

    def main():
        lia=[  0,  2,  6,  8,  9, 10, 14, 16]
        lib=[ 35, 41, 21, 14,  6,  2,  0,  2]
        # Draw scatter plot and save it as png format file.

    if __name__ == '__main__':
        main()

Output example

### Problem 028

Make a program, which generates a list variable $a$ = [35, 41, 21, 14, 6, 2, 0, 2] and draws a bar plot with $a$ as horizontal axis.

Output example

### Problem 029

Make a program, which prints $\log_{10}{10}$, $\log_{2}{10}$, $\ln{10}$ and $\cos{180^\circ}$ to the screen. The "math" module is useful for this.
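A minimal sketch with the standard "math" module, assuming Python 3 (where `math.log2` is available). The main pitfall is that `math.cos` expects radians, so the angle in degrees has to be converted first.

```python
import math

log10_value = math.log10(10)
log2_value = math.log2(10)
ln_value = math.log(10)                   # natural logarithm
cos_value = math.cos(math.radians(180))   # convert degrees to radians first

print("log_10(10)", log10_value)
print("log_2(10)", log2_value)
print("log_e(10)", ln_value)
print("cos(180)", cos_value)
```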

Execution example

    $ bp029.py
    log_10(10) 1.0
    log_2(10) 3.321928094887362
    log_e(10) 2.302585092994046
    cos(180) -1.0

### - Multidimensional array

### Problem 030

Make a program, which generates a two dimensional list variable $a$ containing 100 copies of the list [0, 1, 2, ..., 19] (=range(20)) and prints it to the screen.

Execution example

    $ bp030.py
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
    .
    .
    .
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]


### Problem 031

Make a program, which generates a two dimensional list variable $a$ containing 100 copies of the list [0, 1, 2, ..., 19] (=range(20)) and a one dimensional list variable $b$ = [0, 1, 2, ..., 9], merges them ($a$ and $b$) and prints the resulting list to the screen. The "append" function is useful for this.

Execution example
19000


### Problem 033

Make a program, which generates a numpy matrix variable $a$ = $\pmatrix{10 & 15 \\30 & 25}$ and prints it to the screen.

Execution example
[15 25]


### Problem 035

Make a program, which generates numpy matrix variables $a$ = $\pmatrix{10 & 15 \\30 & 25}$ and $b$ = $\pmatrix{4 & 15 \\-11 & 5}$, calculates the element-wise product of $a$ and $b$ and prints the product to the screen.

### Problem 036

Make a program, which generates numpy matrix variables $a$ = $\pmatrix{10 & 15 \\30 & 25}$ and $b$ = $\pmatrix{4 & 15 \\-11 & 5}$, calculates the matrix product of $a$ and $b$ and prints it to the screen.
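The difference between the two products in Problems 035 and 036 can be sketched as follows, assuming the matrices are stored as plain `numpy.array` objects and Python 3.5 or later (for the `@` operator). With the older `numpy.matrix` class, `*` means the matrix product instead, so the choice of type matters.

```python
import numpy as np

a = np.array([[10, 15], [30, 25]])
b = np.array([[4, 15], [-11, 5]])

elementwise = a * b      # multiplies matching entries
matrix_product = a @ b   # row-by-column product, same as np.dot(a, b)

print(elementwise)
print(matrix_product)
```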

### Problem 037

Make a program, which generates a numpy matrix variable $a$ = $\pmatrix{10 & 15 \\30 & 25}$, calculates sum, mean, variance and standard deviation of the elements and prints them to the screen.

Execution example

    $ bp037.py
    sum 80
    mean 20.0
    var 62.5
    std 7.90569415042

### Problem 038

Make a program, which generates a numpy matrix variable $a$ = $\pmatrix{4 & 15 & 4 \\ -11 & 5 & 6 \\ 2 & 4 & 8}$, calculates its determinant and inverse matrix and prints them to the screen.

### Problem 039

Make a program, which generates a numpy matrix variable $a$ = $\pmatrix{3 & 4 & 1 & 4 \\ 1 & 2 & 1 & 1 \\ 1 & 1 & 2 & 1\\ 1 & 1 & 1 & 2}$, calculates the eigenvalues and eigenvectors of $a$ and prints them to the screen.

### Problem 040

Make a program, which calculates the square root of the matrix $a$ = $\pmatrix{3 & 4 & 1 & 4 \\ 1 & 2 & 1 & 1 \\ 1 & 1 & 2 & 1\\ 1 & 1 & 1 & 2}$, namely $a^{\frac{1}{2}}$, and prints the result to the screen. To do this, use the following formula. Here, $P$ and $D$ are the matrix of eigenvectors of $A$ and the diagonal matrix of its eigenvalues, respectively.

\begin{eqnarray*}A^n=PD^nP^{-1}\end{eqnarray*}

Execution example

    $ bp040.py
    [[ 1.46284538  1.39196824 -0.00171605  1.39196824]
     [ 0.3091356   1.20551839  0.36094421  0.20551839]
     [ 0.3091356   0.20551839  1.36094421  0.20551839]
     [ 0.3091356   0.20551839  0.36094421  1.20551839]]
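The formula $A^n=PD^nP^{-1}$ with $n=\frac{1}{2}$ can be sketched on a small example. The $2 \times 2$ matrix `m` below is a hypothetical stand-in (symmetric, with eigenvalues 1 and 3), not the matrix from the problem; the sketch assumes the matrix is diagonalizable with real, non-negative eigenvalues.

```python
import numpy as np

# Hypothetical example matrix with eigenvalues 1 and 3.
m = np.array([[2.0, 1.0], [1.0, 2.0]])

eigenvalues, p = np.linalg.eig(m)        # columns of p are the eigenvectors
d_half = np.diag(np.sqrt(eigenvalues))   # D^(1/2): square roots on the diagonal
root = p @ d_half @ np.linalg.inv(p)     # P D^(1/2) P^(-1)

print(root)
print(root @ root)  # should reproduce m up to rounding error
```

Multiplying the result by itself is a convenient check that the computed root is correct.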


### Problem 041

Make a program, which generates 10 float values (as a list variable) drawn from the normal distribution $N(0,5^2)$ and prints them.

### Problem 042

Make a program, which generates a $10 \times 10$ numpy matrix $a$ filled with random numbers between 0 and 1 ($[0,1]$) and prints it.

### Problem 043

Make a program, which randomly samples 10 numbers from the integer list variable [0, 1, 2, ..., 999] (=range(1000)) and prints the sampled numbers.
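A possible sketch for Problems 041 - 043 using numpy's `Generator` interface (available in numpy 1.17 and later). Several points are assumptions: the seed is arbitrary, sampling is done without replacement, and `rng.random` draws from the half-open interval $[0,1)$ rather than the closed interval. Note that `scale` is the standard deviation (5), not the variance (25).

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, for reproducibility only

# N(0, 5^2): loc is the mean, scale is the standard deviation.
normal_values = rng.normal(loc=0.0, scale=5.0, size=10).tolist()

# 10 x 10 matrix of uniform random numbers in [0, 1).
uniform_matrix = rng.random((10, 10))

# 10 distinct integers drawn from 0..999 (sampling without replacement).
sample = rng.choice(1000, size=10, replace=False)

print(normal_values)
print(uniform_matrix)
print(sample)
```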

### - File handling and string processing

For problems 044 - 050, use the file, wikipedia_ubc.txt.

### Problem 044

Make a program, which reads the input file, wikipedia_ubc.txt, given on the command line, finds the numbers surrounded by square brackets, '[' and ']', counts them and prints the total count to the screen.
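The matching step can be sketched with a regular expression. The string `text` below is a hypothetical stand-in for the contents of the input file, and reading the file name from the command line is left out.

```python
import re

# Hypothetical text standing in for the contents of the input file.
text = "Founded in 1908,[1] the university has two campuses.[2][13] See [note] for details."

# \[ and \] match literal brackets; (\d+) captures digits only,
# so "[note]" is not counted.
numbers = re.findall(r"\[(\d+)\]", text)
print(len(numbers))
```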

### Problem 045

Make a program, which reads the input file, extracts all URLs from it and outputs them to the screen.

Execution example
artifacts
Act
.
.
.


58,945
58,000
535,000
100,000
.
.
.


### Problem 049

Make a program, which reads the input file, counts the number of occurrences of "America", "British Columbia", "Canada", "Okanagan", "UBC" and "Vancouver" respectively (count exact matches only; do not count "American", "Canadas", etc.), draws a bar graph of the counts (Y-axis: the number of occurrences, X-axis: each word) and outputs it as a PNG image file.
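The exact-match requirement can be sketched with word boundaries in a regular expression. The string `text` is a hypothetical stand-in for the file contents, and the plotting step is omitted here.

```python
import re

# Hypothetical sentence standing in for the contents of the input file.
text = "UBC is in Vancouver, Canada. American and Canadian students attend UBC."

words = ["America", "British Columbia", "Canada", "Okanagan", "UBC", "Vancouver"]
counts = {}
for word in words:
    # \b marks a word boundary, so "America" does not match inside "American".
    pattern = r"\b" + re.escape(word) + r"\b"
    counts[word] = len(re.findall(pattern, text))

print(counts)
```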

### Problem 050

Make a program, which reads the input file, replaces the string "programme" with "program" and outputs the resulting text to the screen.

Execution example
A       1116
C       455
G       499
T       896
AA      510
AC      139
.
.
.
TTG     28
TTT     190


### Problem S002 (machine learning)

For this problem, use the CSV file s002.csv.

A: Make a program, which reads the CSV file and calculates summary statistics (minimum, maximum and mean) for the variable (column name) designated by a command line argument.

Execution example

    $ sp002a.py s002.csv atndrte
    ***statistic of atndrte***
    minimum: 6.25
    maximum: 100
    mean: 81.70956

B: Make a program, which estimates the following regression model and outputs its coefficients.

\begin{eqnarray*}\mathrm{atndrte}=\beta_0+\beta_1\mathrm{priGPA}+\beta_2\mathrm{ACT}+u\end{eqnarray*}

C: Make a program, which predicts the value of $atndrte$, if $priGPA=3.65$ and $ACT=20$.

Description of provided data

    attend termGPA priGPA ACT final atndrte hwrte frosh soph skipped stndfnl

    Obs: 680

     1. attend     classes attended out of 32
     2. termGPA    GPA for term
     3. priGPA     cumulative GPA prior to term
     4. ACT        ACT score
     5. final      final exam score
     6. atndrte    percent classes attended
     7. hwrte      percent homework turned in
     8. frosh      =1 if freshman
     9. soph       =1 if sophomore
    10. skipped    number of classes skipped
    11. stndfnl    (final - mean)/sd

### Problem S003 (bioinformatics)

Make a program, which reads a nucleic sequence in a FASTA format file such as s003.fa, and translates the sequence to amino acid sequences. The output must be FASTA format.

Execution example

    $ sp003.py s003.fa
    >s003
    PSRAFWREEEEEEVGGGP*
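The translation step can be sketched as below. This assumes the standard genetic code, DNA letters (T rather than U), translation of the first reading frame only, "*" for stop codons and "X" for codons containing unknown bases. FASTA parsing and output are omitted, and the names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Amino acids of the standard genetic code, listed in TCAG codon order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Map each of the 64 codons to its one-letter amino acid.
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(sequence):
    """Translate a DNA sequence in reading frame 1; trailing bases are ignored."""
    sequence = sequence.upper()
    codons = (sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


print(translate("ATGGCCTAA"))
```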


### Problem S004 (numerical calculation)

Consider a variable ${\bf x}=(2,4,6,8)$. For this $\bf x$, we want to calculate $a$, which minimizes $y=\sum_{i=1}^{4}(x_i-a)^2$. Make a program, which implements a gradient descent method to find this $a$.
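A minimal gradient descent sketch for this objective. The derivative is $\frac{dy}{da}=-2\sum_{i=1}^{4}(x_i-a)$, and each step moves $a$ against it. The starting point, learning rate and number of iterations are assumed values, not prescribed by the problem.

```python
x = [2, 4, 6, 8]

a = 0.0                # arbitrary starting point
learning_rate = 0.05   # assumed step size
for _ in range(200):   # assumed number of iterations
    # dy/da = -2 * sum(x_i - a)
    gradient = -2 * sum(xi - a for xi in x)
    a -= learning_rate * gradient

print("Predicted a is", round(a, 6))
```

With this step size each update shrinks the distance to the minimum by a constant factor, so the loop converges quickly; a learning rate that is too large would make the iteration diverge.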

Execution example

    $ sp004.py
    Predicted a is 5.0

### Problem S005 (numerical calculation)

Consider a variable ${\bf x}=(2,4,6,8)$. For this $\bf x$, we want to find ${\bf w}=(w_1,w_2,w_3,w_4)$, which minimizes $y=\sum_{i=1}^{4}((-1)^{i}x_i-w_i)^2$. Make a program, which implements the (1+1)-ES (evolution strategy) algorithm to find this $\bf w$.

Execution example

    $ sp005.py
    Predicted w is (-2.0, 4.0, -6.0, 8.0)
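One way a (1+1)-ES could look: a single parent produces one mutated child per generation, and the child replaces the parent if it is not worse. The step-size adaptation below follows a simple variant of the 1/5 success rule, which is an assumption on top of the problem statement, as are the seed, starting point, initial step size and number of generations.

```python
import random

x = [2, 4, 6, 8]
# Targets (-1)^i * x_i for i = 1..4.
targets = [(-1) ** i * xi for i, xi in enumerate(x, start=1)]


def objective(w):
    """Sum of squared differences between the targets and w."""
    return sum((t - wi) ** 2 for t, wi in zip(targets, w))


random.seed(0)        # arbitrary seed, for reproducibility only
parent = [0.0] * 4    # arbitrary starting point
sigma = 1.0           # assumed initial mutation step size
for _ in range(3000):
    # Mutate every component with Gaussian noise of width sigma.
    child = [wi + random.gauss(0, sigma) for wi in parent]
    if objective(child) <= objective(parent):
        parent = child          # the child replaces the parent
        sigma *= 1.5            # widen the search after a success
    else:
        sigma *= 1.5 ** -0.25   # narrow it after a failure

print("Predicted w is", tuple(round(wi, 3) for wi in parent))
```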


### Problem S006 (bioinformatics)

Make a program, which reads the contig file, s006.txt, calculates the number of contigs, the whole genome size, the mean, maximum and minimum contig lengths, the N50 and the GC content (%), and outputs them to the screen.
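The two less obvious statistics, N50 and GC content, can be sketched as follows (in Python 3). N50 is taken here as the length of the contig at which the running total of lengths, sorted from longest to shortest, first reaches half of the total length. The contigs are hypothetical, parsing of s006.txt is omitted because its format is not described here, and the function names are illustrative.

```python
def n50(lengths):
    """Return the contig length at which the cumulative length reaches half the total."""
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def gc_fraction(sequences):
    """Return the fraction of G and C among all bases in the given sequences."""
    joined = "".join(sequences).upper()
    return (joined.count("G") + joined.count("C")) / len(joined)


# Hypothetical contigs, not taken from s006.txt.
contigs = ["ATGC", "GGCCAT", "AT"]
print(n50([len(c) for c in contigs]), gc_fraction(contigs))
```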

Execution example

    $ sp006.R s006.txt
    Contigs number 4,021
    Total length 4,040,018
    Mean length 1,004.7
    Max length 3,996
    Min length 172
    N50 1,029
    GC (%) 0.34299

### Problem S007 (statistical test)

Make a program, which reads a file, s007.txt, and conducts the Tukey-Kramer test to determine whether there are statistically significant differences among the data (instances A - F).

Execution example

    $ sp007.R s007.txt
      Tukey multiple comparisons of means
        95% family-wise confidence level

    Fit: aov(formula = score ~ group)

Output example