Linux Essential - Training Course - Minh, Inc. Software development

Linux Essentials
Linux Essential - Training Course
Day 1 - Morning
1. Introduction to Linux
●GNU Project / GPL Licensing
●Evolution of Linux & Development Model
●Device Identities in Linux - Partitioning Schema
2. Introduction to Kernel
●History of Linux
●Types of Kernel
●The Linux kernel
●Kernel Architecture
3. Shell Commands & Shell Scripting
●Basic Shell commands
●Bash Shell Essentials
- Introduction
- Process
- Redirection
- Shell Programming
- Programming Commands
- Advance Shell Programming
- Function
- Array
- I/O Redirection and file descriptor
- Local and Global variables
- Conditional Execution
●Creating Makefiles
Day 1 - Afternoon
Lab
●Implement quick sort in Shell Programming
Day 2 Morning
4. Creating Libraries
●Creating Static Library
- Using Static Library
●Creating Shared Library
- Using Shared Library
5. The Boot Process
●BIOS Level
- Boot Loader – Setup
- startup_32 functions
●The start_kernel() function
Day 2 - Afternoon
6. The File System
●Virtual File system & its role
●Files associated with a process
●proc file system
●System Calls
Lab
●Implement late binding
●Create hard link
●Create soft link
●Write a program to enumerate stat structure for both hard link and soft link.
Illustrate which field is different
Day 3 - Morning
7. Process Management
●Process Defined
●Process Descriptor Structures in the
kernel
●Process States
●Process Scheduling
●Process Creation
●System calls related to process
management
8. Memory Management
●Defining and Creating secondary memory
areas
●Memory allocation & deallocation system
calls malloc,calloc, alloca, free
●Demand Paging defined
●Process Organization in Memory
●Address Translation and page fault
handling
●Virtual Memory Management
Day 3 - Afternoon
Labs
●Create a child process and validates if all
open descriptors are copied to child
process also.
- Use file seek from parent and see child's
descriptor also got seeked.
Day 4 - Morning
9. Multi Thread Programming
●Creating multiple threads
●Parent synchronization with other Thread
10. Inter process communication
●Pipes, Fifo's, signals - System-V IPC's
●Message queues - Shared memory Semaphores
Day 4 - Afternoon
Labs
●Write a multi threaded application and
check if global variables are shared.
- Protect them using semaphores
Day 5 - Morning
11. Sockets
●An Overview
●System calls related to TCP and UDP socket
12. Network Programming
●TCP Server Client Programming
●UDP Server Client Programming
●Netlink socket interface
13. Programming and Debugging Tools
●strace – Tracing system calls
●ltrace – Tracing library calls
●Tools used to detect memory access errors and memory leaks in Linux: mtrace
●Using gdb and ddd utilities
●Core dump analysis etc...
Day 5 - Afternoon
Labs
●Write a program to create a UDP socket
which writes data to it, also write a UDP
receiver to it.
DISCLAIMER
This document was edited on CentOS 5 using the OpenOffice 3.1.1 Draw package.
CentOS can be downloaded freely from centos.org/download
Open Office 3.1.1 can be obtained through yum or through openoffice.org
Text of this document is written in Bembo Std Otf (13 pt) font.
Code parts are written in Consolas (10 pts) font.
This training material is provided through Minh, Inc., B'lore, India
Document is available at minhinc.com/training/advance-linux-slides.pdf
For suggestion(s) or complaint(s) write to us at [email protected]
Document modified on 06/2017
Document contains 178 pages.
Day 1 - Morning
1. Introduction to Linux
●GNU Project / GPL Licensing
a) You must cause the modified files to carry prominent notices stating that you
changed the files and the date of any changes.
b) You must cause any work that you distribute or publish, that in whole or in part
contains or is derived from the Program or any part thereof, to be licensed as a whole
at no charge to all third parties under the terms of this License.
c) If the modified program normally reads commands interactively when run, you
must cause it, when started running for such interactive use in the most ordinary way,
to print or display an announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide a warranty) and that
users may redistribute the program under these conditions, and telling the user how to
view a copy of this License. (Exception: if the Program itself is interactive but does not
normally print such an announcement, your work based on the Program is not
required to print an announcement.)
1. Introduction to Linux
●Evolution of Linux & Development Model
* 1991: The Linux kernel is publicly announced on 25 August by
the 21-year-old Finnish student Linus Benedict
Torvalds.^[13]
* 1992: The Linux kernel is relicensed under the GNU GPL. The
first Linux distributions are created.
* 1993: Over 100 developers work on the Linux kernel. With
their assistance the kernel is adapted to the GNU
environment, which creates a large spectrum of application
types for Linux. The oldest currently (as of 2015) existing
Linux distribution, Slackware, is released for the first
time. Later in the same year, the Debian project is
established. Today it is the largest community distribution.
* 1994: Torvalds judges all components of the kernel to be fully matured:
he releases version 1.0 of Linux. The XFree86 project contributes a
graphical user interface (GUI). Commercial Linux distribution makers
Red Hat and SUSE publish version 1.0 of their Linux distributions.
* 1995: Linux is ported to the DEC Alpha and to the Sun SPARC. Over
the following years it is ported to an ever greater number of platforms.
* 1996: Version 2.0 of the Linux kernel is released. The kernel can now
serve several processors at the same time using symmetric multiprocessing
(SMP), and thereby becomes a serious alternative for many companies.
* 1998: Many major companies such as IBM, Compaq and Oracle announce
their support for Linux. The Cathedral and the Bazaar is first published as
an essay (later as a book), resulting in Netscape publicly releasing the
source code to its Netscape Communicator web browser suite. Netscape's
actions and crediting of the essay^[50] brings Linux's open source
development model to the attention of the popular technical press. In
addition a group of programmers begins developing the graphical user
interface KDE.
* 1999: A group of developers begins work on the graphical environment
GNOME, destined to become a free replacement for KDE, which at the
time depended on the then-proprietary Qt toolkit. During the year IBM
announces an extensive project for the support of Linux.
* 2000: Dell announces that it is now the No. 2 provider of Linux-based
systems worldwide and the first major manufacturer to offer Linux
across its full product
* 2002: The media reports that "Microsoft killed Dell Linux"^[52]
* 2004: The XFree86 team splits up and joins with the existing X standards
body to form the X.Org Foundation, which results in a substantially faster
development of the X server for Linux.
* 2005: The project openSUSE begins a free distribution from Novell's
community. Also the project OpenOffice.org introduces version 2.0 that
then started supporting OASIS OpenDocument standards.
* 2006: Oracle releases its own distribution of Red Hat Enterprise Linux.
Novell and Microsoft announce cooperation for a better interoperability
and mutual patent protection.
* 2007: Dell starts distributing laptops with Ubuntu pre-installed on them.
* 2009: RedHat's market capitalization equals Sun's, interpreted as a symbolic
moment for the "Linux-based economy".^[53]
* 2011: Version 3.0 of the Linux kernel is released.
* 2012: The aggregate Linux server market revenue exceeds that
of the rest of the Unix market.^[54]
* 2013: Google's Linux-based Android claims 75% of the smartphone
market share, in terms of the number of phones shipped.^[55]
* 2014: Ubuntu claims 22,000,000 users.^[56]
* 2015: Version 4.0 of the Linux kernel is released.
1. Introduction to Linux
●Device Identities in Linux - Partitioning Schema
Devices come in two flavours:
- A character device represents a hardware device that reads or writes a serial stream of
data bytes. Examples are serial and parallel ports, tape drives, terminal devices and sound cards.
- A block device represents a hardware device that reads or writes data in fixed-size
blocks. Unlike a character device, a block device provides random access to the data stored
on the device. A disk drive is an example of a block device.
Linux identifies devices using two numbers: the major device number and the minor
device number.
The major device number generally identifies a driver, whereas the minor number identifies
a device controlled by that driver, so the actual device is identified by the major:minor
combination. A device can be master or slave. Masters are identified with 1,2,3... and
slaves with 65,66,67...
For each device there is a device file (device entry) in the file system. The cp, rm and mv
commands work on a device file as on a regular file. Data is transferred to and from the actual
device through the device driver. Use mknod (as root) to create the file entry for a device.
$ mknod ./lp0 c 6 0
lp0 - path to the device file
c - character device, b for block device
6 - major device number, driver id
0 - minor device number
$ ls -l lp0
crw-r----- 1 root root 6, 0 Mar 7 17:03 lp0
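#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h> /* major(), minor() */
int main(int argc, char *argv[]){
struct stat st;
const char *path = (argc > 1) ? argv[1] : "lp0"; /* default to the file created above */
if (stat(path, &st) != 0) { perror("stat"); return 1; }
printf("file type: %s\n", S_ISCHR(st.st_mode) ? "character device" :
S_ISBLK(st.st_mode) ? "block device" : "other");
printf("major device number: %u\n", major(st.st_rdev));
printf("minor device number: %u\n", minor(st.st_rdev));
return 0;
}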
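The character/block distinction can also be checked from the shell with the -c and -b file tests. The following is a minimal sketch, assuming a Linux system where /dev/null and /tmp exist; the helper name dev_kind is hypothetical.

```shell
#!/bin/sh
# dev_kind is a hypothetical helper: report what kind of device file a path is.
dev_kind() {
    if [ -c "$1" ]; then
        echo "character device"
    elif [ -b "$1" ]; then
        echo "block device"
    else
        echo "not a device file"
    fi
}

dev_kind /dev/null   # /dev/null is a character device on Linux
dev_kind /tmp        # a directory, so not a device file
```

The same tests work on a file created with mknod, such as the lp0 entry above.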
Day 1 - Morning
2. Introduction to Kernel
History
- UNIX: 1969 Thompson & Ritchie AT&T Bell Labs.
- BSD: 1978 Berkeley Software Distribution.
- Commercial Vendors: Sun, HP, IBM, SGI, DEC.
- GNU: 1984 Richard Stallman, FSF.
- POSIX: 1986 IEEE Portable Operating System unIX.
- Minix: 1987 Andrew Tanenbaum.
- SVR4: 1989 AT&T and Sun.
- Linux: 1991 Linus Torvalds Intel 386 (i386).
- Open Source: GPL.
Linux Features
- UNIX-like operating system
- Features:
- Preemptive multitasking.
- Virtual memory (protected memory, paging).
- Shared libraries.
- Demand loading, dynamic kernel modules.
- Shared copy-on-write executables.
- TCP/IP networking.
- SMP support.
- Open source.
What's a Kernel?
- AKA: executive, system monitor.
- Controls and mediates access to hardware.
- Implements and supports fundamental abstractions:
- Processes, files, devices etc.
- Schedules / allocates system resources:
- Memory, CPU, disk, descriptors, etc.
- Enforces security and protection.
- Responds to user requests for service (system calls).
- Etc…etc…
Kernel Design Goals
- Performance: efficiency, speed.
- Utilize resources to capacity with low overhead.
- Stability: robustness, resilience.
- Uptime, graceful degradation.
- Capability: features, flexibility, compatibility.
- Security, protection.
- Protect users from each other & system from bad
users.
- Portability.
- Extensibility.
Types of Kernel
- Monolithic.
- Layered.
- Modularized.
- Micro-kernel.
- Virtual machine.
A monolithic kernel is a kernel in which all services (file systems, VFS, device drivers, etc.)
and the core functionality (scheduling, memory allocation, etc.) form a tightly knit
group sharing the same address space. This is the opposite of a microkernel.
In a monolithic kernel architecture the entire operating system runs in kernel space,
in supervisor mode. Unlike other architectures, the monolithic kernel alone defines a
high-level virtual interface over the computer hardware: a set of primitives, or system
calls, implements all operating system services such as process management, concurrency
and memory management, and device drivers can be added to the kernel as modules.
A microkernel prefers an approach where core functionality is isolated from system
services and device drivers (which are basically just system services). For instance, VFS
(virtual file system) and block device file systems (e.g. minixfs) are separate processes
that run outside of the kernel's space, using IPC to communicate with the kernel,
other services and user processes. In short, if it's a module in Linux, it's a service in a
microkernel, indicating an isolated process.
Recent versions of Windows, on the other hand, use a hybrid kernel.
A hybrid kernel is a kernel architecture that combines aspects of the microkernel
and monolithic kernel architectures. The category is controversial because of its
similarity to the monolithic kernel; the term has been dismissed
by some as simple marketing. The traditional kernel categories are monolithic kernels
and microkernels (with nanokernels and exokernels seen as more extreme versions of
microkernels).
The Linux Kernel
- Monolithic.
Linux Source Tree Layout
linux/arch
- Subdirectories for each current port.
- Each contains kernel, lib, mm, boot and other directories whose
contents override code stubs in architecture independent code.
- lib directory contains highly-optimized common utility routines
such as memcpy, checksums, etc.
- arch directory as of 2.4:
- alpha, arm, i386, ia64, m68k, mips, mips64.
- ppc, s390, sh, sparc, sparc64.
linux/drivers
- Largest amount of code in the kernel tree (~1.5M).
- device, bus, platform and general directories.
- drivers/char – n_tty.c is the default line discipline.
- drivers/block – elevator.c, genhd.c, linear.c, ll_rw_blk.c, raidN.c.
- drivers/net –specific drivers and general routines Space.c and
net_init.c.
- drivers/scsi – scsi_*.c files are generic; sd.c (disk), sr.c (CDROM), st.c (tape), sg.c (generic).
- General:
- cdrom, ide, isdn, parport, pcmcia, pnp, sound, telephony,
video.
- Buses – fc4, i2c, nubus, pci, sbus, tc, usb.
- Platforms – acorn, macintosh, s390, sgi.
linux/fs
- Contains:
- virtual filesystem (VFS) framework.
- subdirectories for actual filesystems.
- vfs-related files:
- exec.c, binfmt_*.c - files for mapping new process images.
- devices.c, blk_dev.c – device registration, block device
support.
- super.c, filesystems.c.
- inode.c, dcache.c, namei.c, buffer.c, file_table.c.
- open.c, read_write.c, select.c, pipe.c, fifo.c.
- fcntl.c, ioctl.c, locks.c, dquot.c, stat.c.
linux/include
- include/asm-*:
- Architecture-dependent include subdirectories.
- include/linux:
- Header info needed both by the kernel and user apps.
- Usually linked to /usr/include/linux.
- Kernel-only portions guarded by #ifdefs:
#ifdef __KERNEL__
/* kernel stuff */
#endif
- Other directories:
- math-emu, net, pcmcia, scsi, video.
linux/init
- Just two files: version.c, main.c.
- version.c – contains the version banner that prints at
boot.
- main.c – architecture-independent boot code.
- start_kernel is the primary entry point.
linux/ipc
- System V IPC facilities.
- If disabled at compile-time, util.c exports stubs that
simply return –ENOSYS.
- One file for each facility:
- sem.c – semaphores.
- shm.c – shared memory.
- msg.c – message queues.
linux/kernel
- The core kernel code.
- sched.c – “the main kernel file”:
- scheduler, wait queues, timers, alarms, task queues.
- Process control:
- fork.c, exec.c, signal.c, exit.c etc…
- Kernel module support:
- kmod.c, ksyms.c, module.c.
- Other operations:
- time.c, resource.c, dma.c, softirq.c, itimer.c.
- printk.c, info.c, panic.c, sysctl.c, sys.c.
linux/lib
- kernel code cannot call standard C library routines.
- Files:
- brlock.c – “Big Reader” spinlocks.
- cmdline.c – kernel command line parsing routines.
- errno.c – global definition of errno.
- inflate.c – “gunzip” part of gzip.c used during boot.
- string.c – portable string code.
- Usually replaced by optimized, architecture-dependent routines.
- vsprintf.c – libc replacement.
linux/mm
- Paging and swapping:
- swap.c, swapfile.c (paging devices), swap_state.c (cache).
- vmscan.c – paging policies, kswapd.
- page_io.c – low-level page transfer.
- Allocation and deallocation:
- slab.c – slab allocator.
- page_alloc.c – page-based allocator.
- vmalloc.c – kernel virtual-memory allocator.
- Memory mapping:
- memory.c – paging, fault-handling, page table code.
- filemap.c – file mapping.
- mmap.c, mremap.c, mlock.c, mprotect.c.
linux/scripts
- Scripts for:
- Menu-based kernel configuration.
- Kernel patching.
- Generating kernel documentation.
Day 1 - Morning
3. Shell Commands & Shell Scripting
●Basic Shell commands
●Bash Shell Essentials
- Introduction
- Process
- Redirection
- Shell Programming
- Programming Commands
- Advance Shell Programming
- Function
- Array
- I/O Redirection and file descriptor
- Local and Global variables
- Conditional Execution
●Creating Makefiles
Shell structure
Shell scripting has four components
1) Kernel
2) Shell Process
3) Command Process
4) Redirectors, Pipes, Filters etc.
Kernel does
- I/O management
- Process management
- File management
- Memory management
+------+        +-------------+        +--------+
| User | -----> | Linux Shell | -----> | Kernel |
+------+        +-------------+        +--------+
                       |
                       V
              +-----------------+
              | command process |
              +-----------------+
Shells
NOTE: To find out which shell you are using, type the following command
$ echo $SHELL
Linux Common Commands
To get help on a command:
Syntax: command-name --help
Syntax: man command-name
Syntax: info command-name
$ date --help
$ ls --help | more
$ man ls
$ info bash
NOTE: In MS-DOS, you get help by using the /? switch or by typing the help command, as in
C:\> dir /?
C:\> date /?
C:\> help time
C:\> help date
C:\> help
Linux Command
$ date
$ who
$ pwd
$ ls
$ cat > myfile
$ more myfile
$ mv sales
$ ln Page1 Book1
$ rm myfile
$ rm -rf oldfiles
$ chmod u+x,g+wx,o+x myscript
$ mail
$ who am i
$ logout
$ mail ashish
$ wc myfile
$ grep fox
$ sort myfile
$ tail -n +5 myfile
$ cmp myfile
$ pr myfile
Process
A process is a running program (a command given by the user) that performs some job. When
you start a process, Linux gives it a number called the PID (process-id). PIDs are positive
integers; the upper limit is configurable through /proc/sys/kernel/pid_max (32768 is the
traditional default).
For example, $ ls -lR is a command, or a request, to list the files in your current directory
and all of its subdirectories.
Why Process required
Linux is a multi-user, multitasking OS. It means you can run two or more processes
simultaneously if you wish. For example, to find how many files you have on your
system you may give a command like
$ ls / -R | wc -l
This command takes a lot of time to search all files on your system, so you can
run it in the background by giving a command like
$ ls / -R | wc -l &
The ampersand (&) at the end of the command tells the shell to start the command (ls / -R | wc
-l), run it in the background and accept the next command immediately. An instance of a
running command is called a process, and the number printed by the shell is its
process-id (PID). This PID can be used to refer to that specific running process.
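The background mechanism described above can be tried with a short-lived command. This is a minimal sketch; sleep stands in for a long-running job, and the variable names bg_pid and bg_status are chosen only for illustration.

```shell
#!/bin/sh
# Start a command in the background; the shell continues immediately.
sleep 1 &
bg_pid=$!                 # $! holds the PID of the most recent background job
echo "background PID: $bg_pid"

# wait blocks until that process finishes and returns its exit status.
wait "$bg_pid"
bg_status=$?
echo "exit status: $bg_status"
```

While the job is still running, the PID can also be passed to commands such as kill or ps.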
Redirection of Standard output/input or Input – Output redirection
(1) > Redirector Symbol (Truncate to zero and write)
Syntax: Linux-command > filename
$ ls > myfiles
(2) >> Redirector Symbol (Append)
Syntax: Linux-command >> filename
$ date >> myfiles
(3) < Redirector Symbol
Syntax: Linux-command < filename
Takes the input of Linux-command from a file instead of the keyboard. For example, to take
the input of the cat command from a file give
$ cat < myfiles
Pipes
A pipe connects the output of one program to the input of another
program without any temporary file. The kernel buffers the data written by the first
command and passes it as input to the second. Pipes are used to run
two or more commands from the same command line.
Syntax: command1 | command2
Filter
A filter is a command that takes its input from a pipe (or standard input), processes it and
writes the result to standard output.
$ tail -n +20 < hotel.txt | head -n 30 > hlist
Here head is a filter that takes its input from the tail command (tail starts
selecting from line number 20 of the given file, i.e. hotel.txt) and passes at most 30 of these
lines to its output, which is redirected to the 'hlist' file.
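The three redirectors and a pipe can be combined in one small session. The sketch below works in a scratch directory created with mktemp; the file name fruits and its contents are made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)

printf 'pear\napple\nfig\n' > "$workdir/fruits"    # > creates or truncates the file
printf 'cherry\n' >> "$workdir/fruits"             # >> appends to it

line_count=$(wc -l < "$workdir/fruits")            # < feeds the file to standard input
first_sorted=$(sort < "$workdir/fruits" | head -n 1)   # sort output is piped into head

echo "lines: $line_count, first after sorting: $first_sorted"
rm -rf "$workdir"
```

Because the pipe hands data directly from sort to head, no intermediate file is needed.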
Introduction to Shell Programming
A shell program is a series of Linux commands.
Variables in Linux
To process data, the shell process keeps values in named variables. There are two kinds:
1) System variables - Created and maintained by Linux itself. This type of variable is
defined in CAPITAL LETTERS.
2) User defined variables (UDV) - Created and maintained by the user. This type of
variable is conventionally defined in lowercase letters.
$ echo $USERNAME
$ echo $HOME
Caution: Do not modify system variables; this can sometimes create problems.
User Defined Variable
Syntax: variablename=value
NOTE: Here 'value' is assigned to the given 'variablename'. The value must be on the right side of
the = sign, and there must be no spaces around =. For example
$ no=10
# this is ok
$ 10=no
# Error, NOT ok: the value must be on the right side of the = sign.
To define a variable called 'vech' having value Bus
$ vech=Bus
To define a variable called n having value 10
$ n=10
You can define a NULL variable as follows (a NULL variable is a variable which has no value at the
time of definition). For example
$ vech=
$ vech=""
Try to print its value with $ echo $vech. Nothing will be shown because the variable has no value,
i.e. it is a NULL variable.
To print or access a variable use the following syntax
Syntax: $variablename
For example, to print the contents of variable 'vech'
$ echo $vech
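The assignment rules above can be collected into a short script. This is a sketch only; the variable names empty and plural are added for illustration and are not part of the course material.

```shell
#!/bin/sh
vech=Bus            # no spaces around =
n=10
empty=              # a NULL (empty) variable

echo "vehicle: $vech, count: $n"
echo "empty is: [$empty]"

# Braces mark where the variable name ends, so "es" is not read as part of the name.
plural="${vech}es"
echo "$plural"
```

Without the braces, the shell would look for a variable called veches, which is unset.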
How to Run Shell Scripts
(1) Use the chmod command as follows to give execute permission to the script
Syntax: chmod +x shell-script-name
OR Syntax: chmod 755 shell-script-name
(chmod 777 also works, but it lets every user modify the script.)
(2) Run the script as
Syntax: ./your-shell-program-name
For example
$ ./first
OR Syntax: /bin/sh your-shell-program-name
For example
$ bash first
$ /bin/sh first
To run the script by name alone, either give its complete path
or add its directory to the PATH variable.
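Both ways of running a script can be seen side by side. The sketch below writes a two-line script named first (the name used in the text) into a scratch directory; it assumes that directory allows execution, which is the usual case for mktemp locations.

```shell
#!/bin/sh
workdir=$(mktemp -d)
script="$workdir/first"

# Write a two-line script to the file.
cat > "$script" <<'EOF'
#!/bin/sh
echo "Hello from first"
EOF

# Without execute permission the file can still be run through an interpreter.
via_sh=$(/bin/sh "$script")

# After chmod +x it can be run directly by its path.
chmod +x "$script"
via_path=$("$script")

echo "$via_sh / $via_path"
rm -rf "$workdir"
```

Running through /bin/sh ignores the execute bit, which is why the first call succeeds before chmod.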
Commands Related with Shell Programming
(1) echo [options] [string, variables...]
Displays text or the value of variables on screen.
Options
-n Do not output the trailing new line.
-e Enable interpretation of the following backslash escaped characters in the strings:
\a alert (bell)
\b backspace
\c suppress trailing new line
\n new line
\r carriage return
\t horizontal tab
\\ backslash
For example $ echo -e "An apple a day keeps away \a\t\tdoctor\n"
(2) More about Quotes
There are three types of quotes
" i.e. Double quotes
' i.e. Single quotes
` i.e. Back quote
1. "Double quotes" - Characters enclosed in double quotes lose their special meaning
(except \, $ and `).
2. 'Single quotes' - Text enclosed in single quotes remains unchanged.
3. `Back quote` - Executes the enclosed command and substitutes its output.
For example
$ echo "Today is date"
This cannot print a message with today's date; it prints the word date.
$ echo "Today is `date`".
Now it prints today's date, as in Today is Tue Jan ...., because the `date` statement uses
back quotes
(see also the Shell Arithmetic NOTE).
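The difference between the three quote types shows up most clearly when the same text is quoted each way. This is a minimal sketch; the variable names are illustrative, and the $(...) form is included as the POSIX equivalent of back quotes.

```shell
#!/bin/sh
name=World

double="Hello $name"        # double quotes: $name is expanded
single='Hello $name'        # single quotes: text is kept literally
backq=`echo Hello $name`    # back quotes: the command's output is substituted
modern=$(echo Hello $name)  # $(...) does the same and nests more easily

echo "$double"
echo "$single"
echo "$backq"
echo "$modern"
```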
(3) Shell Arithmetic
Used to perform arithmetic operations. For example
$ expr 1 + 3
$ expr 2 - 1
$ expr 10 / 2
$ expr 20 % 3 # remainder, read as 20 mod 3; the remainder is 2
$ expr 10 \* 3 # multiplication: use \* not *, since * is a wild card
$ echo `expr 6 + 3`
For the last statement note the following points
1) First, before the expr keyword we used the ` (back quote) sign, not the ' (single quote) sign. The
back quote is generally found on the key under tilde (~) on PC keyboards, above the TAB key.
2) Second, the expr command also ends with `, i.e. a back quote.
3) Here expr 6 + 3 is evaluated to 9, then the echo command prints 9 as the sum.
4) If you use double quotes or single quotes instead, it will NOT work. For example
$ echo "expr 6 + 3" # It will print expr 6 + 3
$ echo 'expr 6 + 3' # It will print expr 6 + 3
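The expr examples can be compared with the shell's built-in arithmetic expansion, $(( )), which is part of POSIX sh and is not covered in the text above. A short sketch with illustrative variable names:

```shell
#!/bin/sh
sum=`expr 6 + 3`          # external expr command, as in the text
product=`expr 10 \* 3`    # * must be escaped for expr

# Arithmetic expansion: no external command, no escaping of *.
sum2=$((6 + 3))
product2=$((10 * 3))
remainder=$((20 % 3))

echo "$sum $product $sum2 $product2 $remainder"
```

Both forms handle integers only; neither does decimal arithmetic.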
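Command Line arguments
$ myshell foo bar
Here myshell is the script name ($0), foo is the first argument ($1) and bar is the second
argument ($2). $# holds the number of arguments (2 here) and $* or $@ refers to all of them.
Function arguments are accessed in the same way ($1, $2, $#) inside a function.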
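The positional parameters can be inspected with a small function, since a function receives its arguments the same way a script does. The function name show_args is hypothetical.

```shell
#!/bin/sh
# show_args is a hypothetical function: print the argument count and each argument.
show_args() {
    echo "count: $#"
    echo "first: $1"
    echo "second: $2"
    for arg in "$@"; do      # "$@" keeps each argument as one word
        echo "arg: $arg"
    done
}

show_args foo bar
```

Quoting "$@" matters when an argument contains spaces; an unquoted $* would split it into several words.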
Exit Status
By default in Linux, when a command or shell script is executed it returns a value:
if the return value is zero (0), the command was successful;
if the return value is nonzero (>0), the command was not successful or some sort of error
occurred while executing the command/shell script.
This value is known as the Exit Status of that command.
To determine the exit status we use the $? variable of the shell. For example
$ rm unknow1file
rm: cannot remove `unknow1file': No such file or directory
and after that if you give the command $ echo $?
it will print a nonzero value (>0) to indicate an error. Now give the commands
$ ls
$ echo $?
It will print 0 to indicate the command was successful.
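Exit statuses can be captured in a script as well as at the prompt. In this sketch the helper report_status is hypothetical, and the path /no/such/file is assumed not to exist.

```shell
#!/bin/sh
# report_status is a hypothetical helper: run a command quietly and print its exit status.
report_status() {
    "$@" >/dev/null 2>&1 && status=0 || status=$?
    echo "$status"
}

report_status true                 # a command that always succeeds
report_status false                # a command that always fails
report_status ls /no/such/file     # ls fails on a missing path
```

The value of $? is overwritten by every command, so it has to be saved immediately after the command of interest.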
If-then-fi for decision making in shell script
$ bc
After this type 5 + 2 as follows
5+2
7
7 is the response of bc, i.e. the addition of 5 + 2. You can even try
5-2
5/2
Now see what happens if you type 5 > 2 as follows
5>2
1
bc prints 1 for a true expression and 0 for a false one. The shell uses the opposite
convention: an exit status of 0 means true (success) and nonzero means false.
Syntax:
if condition
then
command1 if condition is true or if exit status
of condition is 0 (zero)
...
...
fi
$ cat > showfile
#!/bin/sh
#
#Script to print file
#
if cat "$1"
then
printf '\n\nFile %s found and successfully echoed\n' "$1"
fi
test command or [ expr ]
The test command or [ expr ] is used to see if an expression is true. If it is true
it returns zero (0),
otherwise it returns nonzero (>0) for false. Syntax: test expression OR
[ expression ]
Now we will write a script that determines whether a given argument number is
positive. Write the script as follows
$ cat > ispositive
#!/bin/sh
#
# Script to see whether argument is positive
#
if test "$1" -gt 0
then
echo "$1 number is positive"
fi
Or
$ cat > ispositive
#!/bin/sh
#
# Script to see whether argument is positive
#
if [ "$1" -gt 0 ]
then
echo "$1 number is positive"
fi
test or [ expr ] works with
1. Integers (numbers without a decimal point)
2. File types
3. Character strings
For integer comparisons use the operators -eq, -ne, -lt, -le, -gt and -ge.
For string comparisons use = (equal), != (not equal), -z (empty) and -n (not empty).
The shell can also test for file and directory types, for example -f (regular file),
-d (directory) and -e (exists).
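One check from each of the three operator families (integer, string, file) is enough to show the pattern. This is a sketch; the counter passed and the file name data.txt are made up for illustration.

```shell
#!/bin/sh
workdir=$(mktemp -d)
touch "$workdir/data.txt"
passed=0

# Integer comparison
if [ 5 -gt 2 ]; then passed=$((passed + 1)); fi

# String comparison: equality and non-empty check
word=linux
if [ "$word" = "linux" ] && [ -n "$word" ]; then passed=$((passed + 1)); fi

# File tests: regular file and directory
if [ -f "$workdir/data.txt" ] && [ -d "$workdir" ]; then passed=$((passed + 1)); fi

# Negation with !
if [ ! -e "$workdir/missing" ]; then passed=$((passed + 1)); fi

echo "checks passed: $passed"
rm -rf "$workdir"
```

Quoting the variables inside [ ] keeps the test well-formed even when a value is empty.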
if ...else...fi
If given condition is true then command1 is executed otherwise command2 is executed.
Syntax:
if condition
then
command1 if condition is true or if exit status
of condition is 0(zero)
...
...
else
command2 if condition is false or if exit status
of condition is >0 (nonzero)
...
...
fi
$ cat > isnump_n
#!/bin/sh
# Script to see whether argument is positive or negative
#
if [ $# -eq 0 ]
then
echo "$0: you must supply one integer"
exit 1
fi
if test $1 -gt 0
then
echo "$1 number is positive"
else
echo "$1 number is negative"
fi
Multilevel if-then-else
Syntax:
if condition
then
condition is zero (true - 0)
execute all commands up to elif statement
elif condition1
then
condition1 is zero (true - 0)
execute all commands up to elif statement
elif condition2
then
condition2 is zero (true - 0)
execute all commands up to else statement
else
None of the above condition, condition1, condition2 are true (i.e.
all of the above nonzero or false)
execute all commands up to fi
fi
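The multilevel form is a natural fit for a three-way decision such as the sign of a number. The function name classify is hypothetical.

```shell
#!/bin/sh
# classify is a hypothetical function illustrating if / elif / else.
classify() {
    if [ "$1" -gt 0 ]; then
        echo "positive"
    elif [ "$1" -lt 0 ]; then
        echo "negative"
    else
        echo "zero"
    fi
}

classify 7
classify -3
classify 0
```

Unlike a plain if...else, this reports zero separately instead of lumping it in with the negative numbers.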
for loop
Syntax:
for variable-name in list
do
execute once for each item in the list, until the list is
finished (repeating all statements between do and done)
done
Suppose,
$ cat > testfor
for i in 1 2 3 4 5
do
echo "Welcome $i times"
done
Run it as,
$ chmod +x testfor
$ ./testfor
while loop
Syntax:
while [ condition ]
do
command1
command2
command3
..
....
done
$ cat > nt1
#!/bin/sh
#Script to test while statement
if [ $# -eq 0 ]
then
echo "Error - Number missing from command line argument"
echo "Syntax : $0 number"
echo " Use to print multiplication table for given number"
exit 1
fi
n=$1
i=1
while [ $i -le 10 ]
do
echo "$n * $i = `expr $i \* $n`"
i=`expr $i + 1`
done
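Both loop forms can also be written with arithmetic expansion instead of expr. The sketch below sums a list with for and builds the multiples of 7 with while; the variable names total and table are illustrative.

```shell
#!/bin/sh
# Sum the numbers 1..5 with a for loop over an explicit list.
total=0
for i in 1 2 3 4 5; do
    total=$((total + i))
done
echo "for-loop sum: $total"

# Collect the first ten multiples of 7 with a while loop.
n=7
i=1
table=""
while [ "$i" -le 10 ]; do
    table="$table$((n * i)) "
    i=$((i + 1))
done
echo "multiples of $n: $table"
```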
The case Statement
The case statement is a good alternative to the multilevel if-then-else-fi statement. It enables you to
match one variable against several values. It is easier to read and write.
Syntax:
case $variable-name in
pattern1)
command
..
command;;
pattern2)
command
..
command;;
patternN) command
..
command;;
*)
command
..
command;;
esac
The $variable-name is compared against the patterns until a match is found. The shell then
executes all the statements up to the two semicolons that are next to each other. The default
is *) and it is executed if no match is found. For example, create the script as follows
$ cat > car
#!/bin/sh
#
# if no vehicle name is given
# i.e. -z "$1" is true because $1 is NULL
#
if [ -z "$1" ]
then
rental="*** Unknown vehicle ***"
else
# otherwise make first arg as rental
rental=$1
fi
case $rental in
"car") echo "For $rental Rs.20 per k/m";;
"van") echo "For $rental Rs.10 per k/m";;
"jeep") echo "For $rental Rs.5 per k/m";;
"bicycle") echo "For $rental 20 paisa per k/m";;
*) echo "Sorry, I can not get a $rental for you";;
esac
Save it by pressing CTRL+D
$ chmod +x car
$ ./car van
$ ./car car
$ ./car Maruti-800
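Patterns in a case statement can also list alternatives with | and match the empty string. In this sketch the function rate_for is hypothetical, the rates mirror the car script, and the bike alias is an addition for illustration.

```shell
#!/bin/sh
# rate_for is a hypothetical function; the rates mirror the car script in the text.
rate_for() {
    case "$1" in
        car)          echo "Rs.20 per km" ;;
        van)          echo "Rs.10 per km" ;;
        jeep)         echo "Rs.5 per km" ;;
        bicycle|bike) echo "20 paisa per km" ;;   # | matches either pattern
        "")           echo "no vehicle given" ;;
        *)            echo "unknown vehicle: $1" ;;
    esac
}

rate_for van
rate_for bike
rate_for Maruti-800
rate_for ""
```

Handling the empty case inside the case statement removes the need for a separate -z test beforehand.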
The read Statement
Used to get input from the keyboard and store it in variables.
Syntax: read variable1 variable2 ... variableN
Create the script as
$ cat > sayH
#
#Script to read your name from key-board
#
echo "Your first name please:"
read fname
echo "Hello $fname, let's be friends!"
Run it as follows
$ chmod +x sayH
$ ./sayH
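When read is given several variables, it splits the input line on whitespace and the last variable takes whatever remains. In the sketch below a here-document stands in for keyboard input so that it runs unattended; the names and the sample line are made up.

```shell
#!/bin/sh
# read assigns one word per variable; the last variable receives the rest of the line.
read -r fname lname rest <<EOF
Ada Lovelace of London
EOF

echo "Hello $fname $lname ($rest)"
```

The -r option stops read from treating backslashes in the input as escape characters.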
Filename Shorthand or meta Characters (i.e. wild cards)
*, ? and [...] are such shorthand characters.
* Matches any string or group of characters.
For example, $ ls * will show all files; $ ls a* will show all files whose names start with the
letter 'a'; $ ls *.c will show all files having extension .c; $ ls ut*.c will show all files having
extension .c whose names begin with 'ut'.
? Matches any single character.
For example, $ ls ? will show all files whose names are one character long; $ ls fo? will show all
files whose names are 3 characters long and begin with fo.
[...] Matches any one of the enclosed characters.
For example, $ ls [abc]* will show all files beginning with the letters a, b or c.
[..-..] A pair of characters separated by a minus sign denotes a range.
For example, $ ls /bin/[a-c]* will show all file names beginning with the letter a, b or c, like
/bin/arch /bin/ash /bin/ash.static
/bin/awk /bin/basename /bin/bash
/bin/bsh /bin/cat /bin/chgrp
/bin/chmod /bin/chown /bin/consolechars
/bin/cp /bin/cpio /bin/csh
But if the first character following the [ is a ! or a ^, any character NOT enclosed is matched:
$ ls /bin/[!a-o]*
$ ls /bin/[^a-o]*
will show all file names that do not begin with a letter from a to o.
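The wildcard rules can be checked against a known set of files. This is a sketch under stated assumptions: the file names are invented, the helper count_matches is hypothetical, and every pattern is chosen so that it matches at least one file (an unmatched pattern is passed through literally).

```shell
#!/bin/sh
workdir=$(mktemp -d)
touch "$workdir/apple.c" "$workdir/banana.c" "$workdir/fog" "$workdir/fox" "$workdir/zebra.txt"

# count_matches is a hypothetical helper: the shell expands the pattern
# before the function runs, so $# is the number of matching names.
count_matches() { echo "$#"; }

c_files=$(count_matches "$workdir"/*.c)          # apple.c banana.c
fo_three=$(count_matches "$workdir"/fo?)         # fog fox
a_or_b=$(count_matches "$workdir"/[ab]*)         # apple.c banana.c
not_a_to_o=$(count_matches "$workdir"/[!a-o]*)   # zebra.txt

echo "$c_files $fo_three $a_or_b $not_a_to_o"
rm -rf "$workdir"
```

Expansion is done by the shell, not by ls, which is why any command (here a function) sees the already-expanded list.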
command1;command2
Runs two commands from one command line. For example, $ date;who will print today's date
followed by the users who are currently logged in.
/dev/null - Used to discard unwanted output of a program
Syntax: command > /dev/null
For e.g. $ ls > /dev/null , the output of this command is not shown on screen; it is sent to this special file. The /dev directory contains other device files. The files in this directory mostly represent peripheral devices such as disks (e.g. floppy disk), sound cards, line printers etc.
Local and Global Shell variable (export command)
Normally all our variables are local. A local variable can be used only in the same shell; if you load another copy of the shell (by typing /bin/bash at the $ prompt), the new shell ignores all of the old shell's variables.
Consider the following example
$ vech=Bus
$ echo $vech
Bus
$ /bin/bash
$ echo $vech
NOTE:-Empty line printed
$ vech=Car
$ echo $vech
Car
$ exit
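$ echo $vech
Bus
Global shell variable: to make a variable of the old shell available in the new shell, use the export command.
Syntax: export variable1 variable2 ... variableN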
For e.g.
$ vech=Bus
$ echo $vech
Bus
$ export vech
$ /bin/bash
$ echo $vech
Bus
$ exit
$ echo $vech
Bus
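The same experiment can be scripted without typing into a second shell. A sketch, assuming sh is available on the PATH; child_sees is a hypothetical helper introduced only for this illustration.

```bash
# child_sees is a hypothetical helper: it starts a new shell process
# and reports the value of vech as that child sees it.
child_sees() {
    sh -c 'echo "child sees: [$vech]"'
}

vech=Bus       # ordinary (local) shell variable
child_sees     # prints: child sees: []
export vech    # now part of the environment handed to child processes
child_sees     # prints: child sees: [Bus]
```

Export works in one direction only: a value changed inside the child shell is never copied back to the parent.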
Conditional execution i.e. && and ||
The control operators are && (read as AND) and || (read as OR). An AND list has the
Syntax: command1 && command2
Here command2 is executed if, and only if, command1 returns an exit status of zero. An OR list has the
Syntax: command1 || command2
Here command2 is executed if, and only if, command1 returns a non-zero exit status. You can use both as follows
command1 && command2 || command3
Here if command1 is executed successfully (exit status is zero) then the shell will run command2, and if command1 is not successful (exit status is non-zero) then command3 is executed. For e.g.
$ rm myf && echo File is removed successfully || echo File is not removed
If the file (myf) is removed successfully (exit status is zero) then the "echo File is removed successfully" statement is executed, otherwise the "echo File is not removed" statement is executed (since the exit status is non-zero).
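The rm example can be wrapped in a function and run twice on the same file to see both branches. A sketch; try_remove is a name chosen for this illustration only.

```bash
# try_remove is a hypothetical helper built from the rm example above.
try_remove() {
    rm "$1" 2>/dev/null && echo "File is removed successfully" || echo "File is not removed"
}

f=$(mktemp)        # scratch file to delete
try_remove "$f"    # the file exists, so rm succeeds
try_remove "$f"    # the file is already gone, so rm fails
```

Note that command1 && command2 || command3 is not a full if/else: command3 also runs when command1 succeeds but command2 fails.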
Functions
A function is a series of instructions/commands. A function performs a particular activity in the shell.
To define a function use the following
Syntax:
function-name ( )
{
command1
command2
.....
...
commandN
return
}
Where function-name is the name of your function, which executes these commands. A return
statement will terminate the function. For e.g. type SayHello() at the $ prompt as follows
$ SayHello()
{
echo "Hello $LOGNAME, Have nice computing"
return
}
$ SayHello
Hello xxxxx, Have nice computing
To have the function available at login time, define it in /etc/bashrc (as root) or in ~/.bashrc.
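Functions also take arguments ($1, $2, ...) and hand back an exit status through return. A small sketch; is_dir is a hypothetical function written for this illustration.

```bash
# is_dir is a hypothetical function: it returns status 0 when its
# first argument names a directory and status 1 otherwise.
is_dir() {
    if [ -d "$1" ]; then
        echo "$1 is a directory"
        return 0
    fi
    echo "$1 is not a directory"
    return 1
}

is_dir / && echo "exit status: $?"
is_dir /no/such/path || echo "exit status: $?"
```

Because the return value is an ordinary exit status, a function can be combined with && and || exactly like any other command.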
I/O Redirection and file descriptors
$ cat > myf
This is my file
^D
The above command sends the output of the cat command to the file myf. Redirection can be used to send output written to stdout or stderr to a file, and to read stdin from a file.
$ rm > tmp1
rm: missing operand
Try `rm --help' for more information.
$ cat tmp1
$ rm > tmp1 2>&1
$ cat tmp1
rm: missing operand
Try `rm --help' for more information.
$
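File descriptor 1 is stdout and file descriptor 2 is stderr. The sketch below separates and merges the two streams; both is a hypothetical helper that writes one line to each.

```bash
# both is a hypothetical helper: one line to stdout, one line to stderr.
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

out=$(mktemp)
both > "$out" 2>/dev/null    # stdout goes to the file, stderr is discarded
cat "$out"
both > "$out" 2>&1           # stderr is sent wherever stdout points: the file
cat "$out"
rm -f "$out"
```

Redirections are processed from left to right, so > file 2>&1 captures both streams, while 2>&1 > file leaves stderr on the terminal.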
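Array
Arrays are defined as
ar=(one two three)
${ar[0]} ${ar[1]} ... # single elements, the index starts at 0
${ar[*]} or ${ar[@]} # for list
${#ar[*]} # for number of elements
Loops over a word list, a brace range and an array
for i in 2 4 5 6; do
echo $i
done
for i in {1..6}; do
echo $i
done
for i in "${ar[@]}"; do
echo $i
done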
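A short sketch that puts the array notation to work; it needs bash, since arrays are not part of plain sh. The element values are arbitrary.

```bash
ar=(one two three)
ar+=(four)                              # append an element
echo "first element: ${ar[0]}"          # indexing starts at 0
echo "number of elements: ${#ar[@]}"
for i in "${!ar[@]}"; do                # ${!ar[@]} expands to the indices
    echo "ar[$i] = ${ar[$i]}"
done
```

Quoting "${ar[@]}" keeps each element as one word even when it contains spaces.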
Creating Makefiles
Constituents of a make file
• Rules
• Variables
• Directives
– Inclusion of another makefile
– Conditional directives
• Comments
– Text that follows # symbol is treated as comment
– To include # literally, prefix with \
Rules
Syntax
target1 [target2…] : [prerequisite1] [prerequisite2 …]
<TAB>command-1
<TAB>command-2
• Explicit rule
– explicitly specify the prerequisites for a specific target
• Implicit rules
– Take advantage of the knowledge make has about known patterns of files (e.g.,
.c, .cpp .o, .s …)
– Further classified into pattern rules & suffix rules
Variables
Predefined
• Some commonly used variables predefined by GNU make: CC, FLAGS, CFLAGS, LDFLAGS, $@, $^, $<
$@ name of the target
$< name of the first prerequisite
$^ names of all prerequisites
Rules written with these variables:
foo: foo1.o foo2.o
	gcc -o $@ $^
foo1.o: foo1.c foo1.h
	gcc -c $<
foo2.o: foo2.c foo2.h
	gcc -c $<
Rules written without them:
foo: foo1.o foo2.o
	gcc -o foo foo1.o foo2.o
foo2.o: foo2.c foo2.h foo1.h
	gcc -c foo2.c
User defined
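ABC := 10 # simply expanded: the value is evaluated once, at the assignment
ABC = 10 # recursively expanded: the value is re-evaluated each time it is used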
Command line variables
Variables can be defined or redefined from command line
$ make
$ make VAR1=abc VAR2=xyz
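Use the override directive in the makefile when an assignment must take effect even if the variable is redefined on the command line
ex.
override VAR1 = dummy
VAR2 =
all:
	echo VAR1 = $(VAR1)
	echo VAR2 = $(VAR2)
With $ make VAR1=abc VAR2=xyz this prints VAR1 = dummy and VAR2 = xyz.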
Conditional assignment
ARCH ?= x86
Append
SRC += x.c
Implicit rules
Wildcard
foo: *.o
	gcc -o $@ $^
Functions
General syntax
$(function-name arg1[,argn])
SRC := x.c y.c z.c
• String functions
– $(subst search-str,replace-str,text)
OBJS := $(subst .c,.o,$(SRC))
– $(patsubst search-pat,replace-pat,text)
OBJS := $(patsubst %.c,obj/%.o,x.c y.c z.c)
• Warning function
– Very useful for debugging
– Can be placed anywhere in a makefile
$(warning TARGET not defined)
outputs in the format
<filename>:<linenum>:TARGET not defined
• Shell function
– Can be used to invoke any external program
today := $(shell date)
Wildcard function
SRCS := $(wildcard *.c)
OBJS := $(subst .c,.o,$(SRCS))
foo: $(OBJS)
	gcc -o $@ $^
foo2.o: foo2.c foo2.h foo1.h
	gcc -c $<
foo1.o: foo1.c foo1.h
	gcc -c $<
Pattern rule
foo : foo1.o foo2.o
	g++ -o $@ $^
foo2.o: foo2.h foo1.h
foo1.o: foo1.h
# pattern rule for .cpp to .o
%.o : %.cpp
	g++ -c $<
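To watch a pattern rule and the automatic variables at work without a compiler, the sketch below writes a tiny makefile from the shell and runs it. Everything in it is an assumption made for illustration: the helper make_demo, the .in/.up/.out file names, and the use of tr and cat as stand-ins for compile and link steps. It needs make to be installed.

```bash
# make_demo is a hypothetical helper; it works in a scratch directory
# inside a subshell so nothing is left behind.
make_demo() (
    dir=$(mktemp -d) && cd "$dir" || exit 1
    echo "hello" > a.in
    echo "world" > b.in
    # printf expands \t to the TAB that must begin each recipe line;
    # %% yields the literal % used by the pattern rule.
    printf 'all.out: a.up b.up\n\tcat $^ > $@\n\n%%.up: %%.in\n\ttr a-z A-Z < $< > $@\n' > Makefile
    make -s all.out && cat all.out
    cd / && rm -rf "$dir"
)
make_demo
```

The pattern rule %.up: %.in is applied once per prerequisite, with $< set to the matching .in file and $@ to the .up file; in the all.out rule, $^ expands to a.up b.up.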
More advanced
%.o:%.c
	$(COMPILE.c) $(OUTPUT_OPTION) $<
where
COMPILE.c =$(CC) $(CFLAGS) $(CPPFLAGS ) $(TARGET_ARCH) -c
CC =cc
OUTPUT_OPTION =-o $@
conditionals
conditional-directive
text-if-true
endif
conditional-directive
text-if-true
else
text-if-false
endif
Conditional directives
- ifeq
- ifneq
- ifdef variable-name
- ifndef variable-name
Day 2 Morning
4. Creating Libraries
●Creating Static Library
- Using Static Library
●Creating Shared Library
- Using Shared Library
Dynamic Loading and Unloading
This functionality is available under Linux by using the dlopen function.
dlopen ("libtest.so", RTLD_LAZY)
The second parameter is a flag that indicates how to bind symbols in the shared library.
Include the <dlfcn.h> header file and link with the -ldl option to pick up the libdl library.
void* handle = dlopen ("libtest.so", RTLD_LAZY);
void (*test)() = dlsym (handle, "my_function");
(*test)();
dlclose (handle);
Both dlopen and dlsym return NULL if they do not succeed. In that event, you can call dlerror (with no parameters) to obtain a human-readable error message describing the problem.
C++ file linking to C shared library
If you’re writing the code in your shared library in C++, you will probably want to declare those functions and variables that you plan to access elsewhere with the extern "C" linkage specifier.
extern "C" void foo ();
This prevents the C++ compiler from mangling the function name, which would
change the function’s name from foo to a different, funny-looking name that encodes
extra information about the function. A C compiler will not mangle names; it will use
whichever name you give to your function or variable.
5. The Boot Process
●BIOS Level
- Boot Loader – Setup
- startup_32 functions
●The start_kernel() function
Linux Boot flow
Booting Sequence
1. Turn on
2. CPU jumps to the address of the BIOS (0xFFFF0)
3. BIOS runs POST (Power-On Self Test)
4. Finds bootable devices
5. Loads and executes the boot sector from the MBR
6. Loads the OS
BIOS refers to the software code run by a computer when it is first powered on.
The BIOS is a program embedded on a chip; its primary function is to recognize and control the various devices that make up the computer.
[Figures: the BIOS chip on the board; the BIOS on screen]
MBR Master Boot Record
- OS is booted from a hard disk, where the Master Boot Record (MBR) contains the
primary boot loader
- The MBR is a 512-byte sector, located in the first sector on the disk (sector 1 of
cylinder 0, head 0)
- After the MBR is loaded into RAM, the BIOS yields control to it.
MBR, Master Boot Record
- The first 446 bytes are the primary boot loader, which contains both executable
code and error message text
- The next sixty-four bytes are the partition table, which contains a record for each of
four partitions
- The MBR ends with two bytes that are defined as the magic number (0xAA55).
The magic number serves as a validation check of the MBR
Extract MBR, Master Boot Record
# dd if=/dev/hda of=mbr.bin bs=512 count=1
# od -xa mbr.bin
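The 446 + 64 + 2 byte layout can be checked on an ordinary file, without root access or a real disk. The sketch below builds an empty 512-byte image and reads back the two signature bytes; the image and the helper mbr_signature are assumptions made for illustration, not a real MBR.

```bash
img=$(mktemp)
# 510 zero bytes stand in for the boot code (446) and partition table (64).
dd if=/dev/zero of="$img" bs=510 count=1 2>/dev/null
# Append the signature bytes 0x55 0xAA (octal 125 and 252).
printf '\125\252' >> "$img"

# mbr_signature is a hypothetical helper: print bytes 510-511 as hex.
mbr_signature() {
    dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n'
    echo
}

wc -c < "$img"          # total size of the sector image in bytes
mbr_signature "$img"    # prints 55aa
```

The bytes appear on disk as 55 AA; read as a little-endian 16-bit value they form the magic number 0xAA55.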
Boot Loader
- The boot loader (or kernel loader) first decompresses the kernel zImage file, then calls the kernel's start_kernel() function, passing the arguments.
- Optional: initial RAM disk
- GRUB and LILO are the most popular Linux boot loaders.
List of Boot loaders
bootman, GRUB, LILO, NTLDR, XOSL, BootX, loadlin, Gujin, Boot Camp, Syslinux, GAG
GRUB Boot Loader
- GRUB is an operating system independent boot loader
- A multi-boot software package from GNU
- Flexible command line interface
- File system access
- Supports multiple executable formats
- Supports diskless systems
- Can download an OS from the network
GRUB Boot Process
1. The BIOS finds a bootable device (hard disk) and transfers control to the master
boot record
2. The MBR contains GRUB stage 1. Given the small size of the MBR, Stage 1 just
loads the next stage of GRUB
3. GRUB Stage 1.5 is located in the first 30 kilobytes of hard disk immediately
following the MBR. Stage 1.5 loads Stage 2.
4. GRUB Stage 2 receives control, and displays to the user the GRUB boot menu
(where the user can manually specify the boot parameters).
5. GRUB loads the user-selected (or default) kernel into memory and passes control
on to the kernel.
GRUB Config File
LILO: LInux LOader
- A versatile boot manager that supports:
- Choice of Linux kernels.
- Boot time kernel parameters.
- Booting non-Linux kernels.
- A variety of configurations.
- Characteristics:
- Lives in MBR or partition boot sector.
- Has no knowledge of filesystem structure so...
- Builds a sector “map file” (block map) to find kernel.
- /sbin/lilo – “map installer”.
- /etc/lilo.conf is the LILO configuration file.
LILO Boot Loader
lilo.conf
boot=/dev/hda
map=/boot/map
install=/boot/boot.b
prompt
timeout=50
default=linux
image=/boot/vmlinuz-2.2.12-20
label=linux
initrd=/boot/initrd-2.2.12-20.img
read-only
root=/dev/hda1
Kernel Booting, Init process
The kernel executes the init program (pid 1), creating the init process.
- Init is the root/parent of all processes executing on Linux
- The first process that init starts is the script /etc/rc.d/rc.sysinit
- Based on the appropriate run-level, scripts are executed to start various processes to
run the system and make it functional
- Init is responsible for starting system processes as defined in the /etc/inittab file
- Init typically starts multiple instances of "getty", which wait for console logins
and then spawn the user's shell process
- Upon shutdown, init controls the sequence and processes for shutdown
Process ID   Description
0            The Scheduler
1            The init process
2            kflushd
3            kupdate
4            kpiod
5            kswapd
6            mdrecoveryd
Linux files structure
Day 2 - Afternoon
6. The File System
●Virtual File system & its role
●Files associated with a process
●System Calls
The File System
Filesystems are containers of files, that are stored, probably in a directory tree,
together with attributes, like size, owner, creation date and the like. A filesystem has a type.
It defines how things are arranged on the disk. For example, one has the types minix, ext2,
reiserfs, iso9660, vfat, hfs.
Linux File System Layout
Inode and direntry
$mkdir testdir
Inode
An (in-core) inode contains the metadata of a file: its serial number, its protection
(mode), its owner, its size, the dates of last access, creation and last modification, etc. It
also points to the superblock of the filesystem the file is in, the methods for this file,
and the dentries (names) for this file.
struct inode {
unsigned long i_ino;
umode_t i_mode;
uid_t i_uid;
gid_t i_gid;
kdev_t i_rdev;
loff_t i_size;
struct timespec i_atime;
struct timespec i_ctime;
struct timespec i_mtime;
struct super_block *i_sb;
struct inode_operations *i_op;
struct address_space *i_mapping;
struct list_head i_dentry;
...
}
User space stat structure provides similar interface
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
int stat (const char *path, struct stat *buf);
int fstat (int fd, struct stat *buf);
int lstat (const char *path, struct stat *buf);
struct stat {
dev_t st_dev; /*ID of device containing file */
ino_t st_ino; /*inode number */
mode_t st_mode; /*permissions */
nlink_t st_nlink; /*number of hard links */
uid_t st_uid; /*user ID of owner */
gid_t st_gid; /*group ID of owner */
dev_t st_rdev; /*device ID (if special file) */
off_t st_size; /*total size in bytes */
blksize_t st_blksize; /*blocksize for filesystem I/O */
blkcnt_t st_blocks; /* number of blocks allocated */
time_t st_atime; /*last access time */
time_t st_mtime; /*last modification time */
time_t st_ctime; /*last status change time */
};
lstat() is identical to stat(), except that if pathname is a symbolic
link, then it returns information about the link itself, not the file
that it refers to.
fstat() is identical to stat(), except that the file about which information
is to be retrieved is specified by the file descriptor fd.
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdio.h>
int main (int argc, char *argv[])
{
struct stat sb;
int ret;
if (argc < 2) {
fprintf (stderr,
"usage: %s <file>\n", argv[0]);
return 1;
}
ret = stat (argv[1], &sb);
if (ret) {
perror ("stat");
return 1;
}
printf ("%s is %ld bytes\n",
argv[1], (long) sb.st_size);
return 0;
}
The following mask values are defined for the file type of the st_mode field:
S_IFMT     0170000   bit mask for the file type bit field
S_IFSOCK   0140000   socket
S_IFLNK    0120000   symbolic link
S_IFREG    0100000   regular file
S_IFBLK    0060000   block device
S_IFDIR    0040000   directory
S_IFCHR    0020000   character device
S_IFIFO    0010000   FIFO
Thus, to test for a regular file (for example), one could write:
stat(pathname, &sb);
if ((sb.st_mode & S_IFMT) == S_IFREG) {
/* Handle regular file */
}
#include "apue.h"
int
main(int argc, char *argv[])
{
int i;
struct stat buf;
char *ptr;
for (i = 1; i < argc; i++) {
printf("%s: ", argv[i]);
if (lstat(argv[i], &buf) < 0) {
err_ret("lstat error");
continue;
}
if (S_ISREG(buf.st_mode))
ptr = "regular";
else if (S_ISDIR(buf.st_mode))
ptr = "directory";
else if (S_ISCHR(buf.st_mode))
ptr = "character special";
else if (S_ISBLK(buf.st_mode))
ptr = "block special";
else if (S_ISFIFO(buf.st_mode))
ptr = "fifo";
else if (S_ISLNK(buf.st_mode))
ptr = "symbolic link";
else if (S_ISSOCK(buf.st_mode))
ptr = "socket";
else
ptr = "** unknown mode **";
printf("%s\n", ptr);
}
exit(0);
}
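The shell's test command exposes the same file-type checks as the S_ISxxx macros. A sketch of the classification above written as a shell function; file_type is a hypothetical name, and the printed strings are copied from the C program.

```bash
# file_type is a hypothetical helper mirroring the C program above.
file_type() {
    # -L must come first: every other test follows symbolic links.
    if   [ -L "$1" ]; then echo "symbolic link"
    elif [ -f "$1" ]; then echo "regular"
    elif [ -d "$1" ]; then echo "directory"
    elif [ -c "$1" ]; then echo "character special"
    elif [ -b "$1" ]; then echo "block special"
    elif [ -p "$1" ]; then echo "fifo"
    elif [ -S "$1" ]; then echo "socket"
    else echo "** unknown mode **"
    fi
}

file_type /
file_type /dev/null
```

Checking -L before the other operators plays the role that lstat() plays in the C version: it looks at the link itself instead of its target.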
Printing all fields
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
int main(void)
{
    struct stat fst;
    struct tm *Time;
    int fd;
    fd = open("testfile", O_RDONLY);
    if (fd < 0 || fstat(fd, &fst) < 0) {
        perror("testfile");
        return 1;
    }
    /* the field types vary between systems, so cast before printing */
    printf("Listing the details of the file\n");
    printf("The inode no of the file is %lu\n", (unsigned long) fst.st_ino);
    printf("The device ID of the file is %lu\n", (unsigned long) fst.st_dev);
    printf("The block size of the file system is %ld\n", (long) fst.st_blksize);
    printf("The user ID is %u\n", (unsigned) fst.st_uid);
    printf("The group ID is %u\n", (unsigned) fst.st_gid);
    printf("Access time is %ld\n", (long) fst.st_atime);
    printf("Status change time is %ld\n", (long) fst.st_ctime);
    printf("Modification time is %ld\n", (long) fst.st_mtime);
    Time = localtime(&fst.st_atime);
    printf("day : %d\n", Time->tm_mday);
    printf("month: %d\n", Time->tm_mon + 1); /* tm_mon counts from 0 */
    printf("year : %d\n", Time->tm_year + 1900); /* tm_year counts from 1900 */
    printf("hour : %d\n", Time->tm_hour);
    printf("min : %d\n", Time->tm_min);
    close(fd);
    return 0;
}
Permissions
While the stat calls can be used to obtain the permission values for a given file, two other
system calls set those values:
#include <sys/types.h>
#include <sys/stat.h>
int chmod (const char *path, mode_t mode);
int fchmod (int fd, mode_t mode);
Example chmod
int ret;
/*
* Set 'map.png' in the current directory to
* owner-readable and -writable. This is the
* same as 'chmod 600 ./map.png'.
*/
ret = chmod ("./map.png", S_IRUSR | S_IWUSR);
if (ret)
perror ("chmod");
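The chmod command is the command-line front end to this system call, so the effect of a mode such as S_IRUSR | S_IWUSR (octal 600) can be observed from the shell. A sketch on a scratch file; perms is a hypothetical helper that cuts the permission string out of ls -l.

```bash
# perms is a hypothetical helper: show the type and permission bits of a file.
perms() { ls -l "$1" | cut -c1-10; }

f=$(mktemp)
chmod 644 "$f"; perms "$f"    # -rw-r--r--
chmod 600 "$f"; perms "$f"    # -rw-------  (S_IRUSR | S_IWUSR)
chmod u=rw,go= "$f"           # symbolic spelling of the same mode
perms "$f"                    # -rw-------
```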
Ownership
In the stat structure, the st_uid and st_gid fields provide the file’s owner and group,
respectively. Three system calls allow a user to change those two values:
#include <sys/types.h>
#include <unistd.h>
int chown (const char *path, uid_t owner, gid_t group);
int lchown (const char *path, uid_t owner, gid_t group);
int fchown (int fd, uid_t owner, gid_t group);
struct group *gr;
int ret;
/*
* getgrnam() returns information on a group
* given its name.
*/
gr = getgrnam ("officers");
if (!gr) {
/* likely an invalid group */
perror ("getgrnam");
return 1;
}
/* set manifest.txt's group to 'officers' */
ret = chown("manifest.txt", -1, gr->gr_gid);
if (ret)
perror ("chown");
Reading a Directory’s Contents
A directory is represented by DIR object
#include <sys/types.h>
#include <dirent.h>
DIR * opendir (const char *name);
To obtain the file descriptor behind a given directory stream:
#define _BSD_SOURCE /* or _SVID_SOURCE */
#include <sys/types.h>
#include <dirent.h>
int dirfd (DIR *dir);
Reading from a directory stream
Once you have created a directory stream with opendir() , your program can begin
reading entries from the directory. To do this, use readdir() , which returns entries one
by one from a given DIR object:
#include <sys/types.h>
#include <dirent.h>
struct dirent * readdir (DIR *dir);
A successful call to readdir() returns the next entry in the directory represented by
dir . The dirent structure represents a directory entry. Defined in <dirent.h> , on
Linux, its definition is:
struct dirent {
ino_t d_ino; /* inode number */
off_t d_off; /* offset to the next dirent */
unsigned short d_reclen; /* length of this record */
unsigned char d_type; /* type of file */
char d_name[256]; /* filename */
};
Applications successively invoke readdir() , obtaining each file in the directory, until
they find the file they are searching for or until the entire directory is read, at which
time readdir() returns NULL .
To close the DIR*
int closedir (DIR *dir);
/*
* find_file_in_dir - searches the directory 'path' for a
* file named 'file'.
*
* Returns 0 if 'file' exists in 'path' and a nonzero
* value otherwise.
*/
int find_file_in_dir (const char *path, const char *file)
{
struct dirent *entry;
int ret = 1;
DIR *dir;
dir = opendir (path);
if (!dir)
return 1; /* could not open 'path' */
errno = 0;
while ((entry = readdir (dir)) != NULL) {
if (strcmp(entry->d_name, file) == 0) {
ret = 0;
break;
}
}
if (errno && !entry)
perror ("readdir");
closedir (dir);
return ret;
}
System calls for reading directory contents
The previously discussed functions for reading the contents of directories are standardized
by POSIX and provided by the C library. Internally, these functions use one of
two system calls, readdir() and getdents() , which are provided here for completeness:
#include <unistd.h>
#include <linux/types.h>
#include <linux/dirent.h>
#include <errno.h>
/*
* Not defined for user space: need to
* use the _syscall3() macro to access.
*/
int readdir (unsigned int fd,
struct dirent *dirp,
unsigned int count);
int getdents (unsigned int fd,
struct dirent *dirp,
unsigned int count);
Links
A link is essentially just a name in a list (a directory) that points at an inode—there
would appear to be no reason why multiple links to the same inode could not exist.
That is, a single inode (and thus a single file) could be referenced from, say, both
/etc/customs and /var/run/ledger.
Hard Link
Files can have 0, 1, or many links. Most files have a link count of 1—that is, they are
pointed at by a single directory entry—but some files have 2 or even more links.
These are called hard links.
The link() system call, one of the original Unix system calls, and now standardized by
POSIX, creates a new link for an existing file:
#include <unistd.h>
int link (const char *oldpath, const char *newpath);
int ret;
/*
* create a new directory entry,
* '/home/kidd/privateer', that points at
* the same inode as '/home/kidd/pirate'
*/
ret = link ("/home/kidd/pirate", "/home/kidd/privateer");
if (ret)
perror ("link");
Symbolic Links
Symbolic links, also known as symlinks or soft links, are similar to hard links in that
both point at files in the filesystem. The symbolic link differs, however, in that it is not
merely an additional directory entry, but a special type of file altogether. This special
file contains the pathname for a different file, called the symbolic link’s target. At
runtime, on the fly, the kernel substitutes this pathname for the symbolic link’s
pathname (unless using the various l versions of system calls, such as lstat() , which
operate on the link itself, and not the target).
Soft links, unlike hard links, can span filesystems. They can also point at a file that does not exist; such a link is called a dangling symlink.
#include <unistd.h>
int symlink (const char *oldpath, const char *newpath);
int ret;
/*
* create a symbolic link,
* '/home/kidd/privateer', that
* points at '/home/kidd/pirate'
*/
ret = symlink ("/home/kidd/pirate", "/home/kidd/privateer");
if (ret)
perror ("symlink");
Unlinking
The converse to linking is unlinking, the removal of pathnames from the filesystem. A
single system call, unlink() , handles this task:
#include <unistd.h>
int unlink (const char *pathname);
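The ln and rm commands call link(), symlink() and unlink() on the user's behalf, which makes the difference between the two kinds of link easy to observe. A sketch in a scratch directory; link_demo, inode, nlink and the name corsair are assumptions made for this illustration.

```bash
# link_demo is a hypothetical helper; it runs in a subshell and cleans up.
link_demo() (
    dir=$(mktemp -d) && cd "$dir" || exit 1
    echo "treasure" > pirate
    ln pirate privateer       # hard link: a second name for the same inode
    ln -s pirate corsair      # symbolic link: a small file holding the path "pirate"

    inode() { ls -di "$1" | awk '{ print $1 }'; }
    nlink() { ls -ld "$1" | awk '{ print $2 }'; }

    [ "$(inode pirate)" = "$(inode privateer)" ] && echo "hard link shares the inode"
    [ "$(inode pirate)" != "$(inode corsair)" ] && echo "symlink has its own inode"
    echo "link count: $(nlink pirate)"

    rm pirate                 # unlink one of the two names
    cat privateer             # the data is still reachable through the other name
    cat corsair 2>/dev/null || echo "corsair is now dangling"
    cd / && rm -rf "$dir"
)
link_demo
```

The inode number and the link count are the st_ino and st_nlink fields of the stat structure; they match for two hard links and differ for a symbolic link examined with lstat().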
VFS,Virtual File Systems
- The Linux kernel implements the concept of Virtual File System (VFS, originally
Virtual Filesystem Switch), so that it is (to a large degree) possible to separate actual
"low-level" filesystem code from the rest of the kernel.
- The VFS is more of an Interface rather than an actual complete file system.
- An important role of the VFS is to perform what are called "Standard Actions". For
example, the function lseek() is not actually implemented by any file system, as the
function of lseek() is provided by a "standard action" of VFS.
- Two important native filesystems in the Linux environment are ext2 and the proc
file system.
Four main objects in VFS API: superblock, dentries, inodes, files
- The kernel keeps track of files using in-core inodes ("index nodes"), usually
derived by the low-level filesystem from on-disk inodes.
- A file may have several names, and there is a layer of dentries ("directory
entries") that represent pathnames, speeding up the lookup operation.
- Several processes may have the same file open for reading or writing, and file
structures contain the required information such as the current file position.
- Access to a filesystem starts by mounting it. This operation takes a filesystem
type (like ext2, vfat, iso9660, nfs) and a device and produces the in-core
superblock that contains the information required for operations on the
filesystem; a third ingredient, the mount point, specifies what pathname refers
to the root of the filesystem.
Auxiliary objects
We have filesystem types, used to connect the name of the filesystem to the routines
for setting it up (at mount time) or tearing it down (at umount time).
- A struct vfsmount represents a subtree in the big file hierarchy - basically a pair
(device, mountpoint).
- A struct nameidata represents the result of a lookup.
- A struct address_space gives the mapping between the blocks in a file and blocks on
disk. It is needed for I/O.
Filesystem type registration
The struct is of type struct file_system_type . Here the 2.2.17 version:
struct file_system_type {
const char *name;
int fs_flags;
struct super_block *(*read_super) (struct super_block *, void *, int);
struct file_system_type *next;
};
The call register_filesystem() hangs this struct in the chain with head
file_systems , and unregister_filesystem() removes it again.
Accesses to this chain are protected by the spinlock file_systems_lock . There
are no other writers. The main reader is of course the mount() system call (via
get_fs_type() ). Other readers are get_filesystem_list() , used for /proc/filesystems ,
and the sysfs system call.
The code is in fs/filesystems.c .
static struct file_system_type tue_fs_type = {
.owner= THIS_MODULE,
.name= "tue",
.get_sb= tue_get_sb,
.kill_sb= kill_block_super,
.fs_flags= FS_REQUIRES_DEV,
};
static int __init init_tue_fs(void) {
return register_filesystem(&tue_fs_type);
}
static void __exit exit_tue_fs(void)
{
unregister_filesystem(&tue_fs_type);
}
Struct file_system_type
struct file_system_type {
const char *name;
int fs_flags;
struct super_block *(*get_sb)(struct file_system_type *,
int, char *, void *, struct vfsmount *);
void (*kill_sb) (struct super_block *);
struct module *owner;
struct file_system_type *next;
struct list_head fs_supers;
struct lock_class_key s_lock_key;
struct lock_class_key s_umount_key;
};
(In 2.4 there was no kill_sb() , and the role of get_sb() was taken by
read_super() . The final parameter of get_sb() and the lock_class_key fields are
present since 2.6.18.)
name
Here the filesystem type gives its name ("tue"), so that the kernel can find it when
someone does mount -t tue /dev/foo /dir
get_sb
At mount time the kernel calls the fstype->get_sb() routine that initializes things and
sets up a superblock. Typically this is a 1-line routine that calls one of get_sb_bdev ,
get_sb_single , get_sb_nodev , get_sb_pseudo
kill_sb
At umount time the kernel calls the fstype->kill_sb() routine to clean up. Typically
one of kill_block_super , kill_anon_super , kill_litter_super .
Example of the use of owner - sysfs
There exists a strange SYSV system call sysfs that will return (i) a sequence number
given a filesystem type, and (ii) a filesystem type given a sequence number, and (iii) the
total number of filesystem types registered now. This call is not supported by libc or
glibc.
These sequence numbers are rather meaningless since they may change any moment.
But this means that one can get a snapshot of the list of filesystem types without
looking at /proc/filesystems . For example, the program
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>
/*
 * glibc has no sysfs() wrapper, so invoke the system call directly.
 * (Old kernel headers offered the _syscall3() macro for this.)
 * SYS_sysfs is only defined on architectures that have the call.
 */
static int sysfs(int option, unsigned int fsindex, char *buf) {
    return syscall(SYS_sysfs, option, fsindex, buf);
}
int main(void) {
    int i, tot;
    char buf[100];
    /* how long is a filesystem type name?? */
    tot = sysfs(3, 0, NULL);
    if (tot == -1) {
        perror("sysfs(3)");
        exit(1);
    }
    for (i = 0; i < tot; i++) {
        if (sysfs(2, i, buf)) {
            perror("sysfs(2)");
            exit(1);
        }
        printf("%2d: %s\n", i, buf);
    }
    return 0;
}
might give output like
0:ext2
1:minix
2:romfs
3:msdos
4:vfat
5:proc
6:nfs
7:smbfs
8:iso9660
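The same list is available without any system call by reading /proc/filesystems. A sketch, assuming a Linux system with /proc mounted; list_fs is a hypothetical helper, and the names printed depend on the running kernel.

```bash
# list_fs is a hypothetical helper: print the registered filesystem types.
list_fs() {
    # Each line is either "<TAB>name" or "nodev<TAB>name";
    # the type name is always the last field.
    awk '{ print $NF }' /proc/filesystems
}

list_fs | head -n 5
```

Types marked nodev in the file, such as proc, do not need a block device to be mounted.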
Mounting
The mount system call attaches a filesystem to the big file hierarchy at some indicated
point. Ingredients needed:
(i) a device that carries the filesystem (disk, partition, floppy, CDROM, SmartMedia
card, ...), (ii) a directory where the filesystem on that device must be attached, (iii) a
filesystem type.
The code for sys_mount() is found in fs/namespace.c and fs/super.c . The connection
with the filesystem type name is made in do_kern_mount() :
struct file_system_type *type = get_fs_type(fstype);
struct super_block *sb;
if (!type)
return ERR_PTR(-ENODEV);
sb = type->get_sb(type, flags, name, data);
and this is the only call of the get_sb() routine.
The code for sys_umount() is found in fs/namespace.c and fs/super.c . The
counterpart of the just quoted code is the cleanup in deactivate_super() :
fs->kill_sb(s);
and this is the only call of the kill_sb() routine.
The superblock
The superblock gives global information on a filesystem: the device on which it lives,
its block size, its type, the dentry of the root of the filesystem, the methods it has, etc.,
etc.
struct super_block {
dev_t s_dev;
unsigned long s_blocksize;
struct file_system_type *s_type;
struct super_operations *s_op;
struct dentry *s_root;
...
}
struct super_operations {
struct inode *(*alloc_inode)(struct super_block *sb);
void (*destroy_inode)(struct inode *);
void (*read_inode) (struct inode *);
void (*dirty_inode) (struct inode *);
void (*write_inode) (struct inode *, int);
void (*put_inode) (struct inode *);
void (*drop_inode) (struct inode *);
void (*delete_inode) (struct inode *);
void (*put_super) (struct super_block *);
void (*write_super) (struct super_block *);
int (*sync_fs)(struct super_block *sb, int wait);
void (*write_super_lockfs) (struct super_block *);
void (*unlockfs) (struct super_block *);
int (*statfs) (struct super_block *, struct statfs *);
int (*remount_fs) (struct super_block *, int *, char *);
void (*clear_inode) (struct inode *);
void (*umount_begin) (struct super_block *);
int (*show_options)(struct seq_file *, struct vfsmount *);
};
This is enough to get started: the dentry of the root directory tells us the inode of this
root directory (and in particular its i_ino ), and sb->s_op->read_inode(inode) will
read this inode from disk. Now inode->i_op->lookup() allows us to find names in the
root directory, etc.
Each superblock is on six lists, with links through the fields s_list , s_dirty , s_io ,
s_anon , s_files , s_instances , respectively.
The super_blocks list
All superblocks are collected in a list super_blocks with links in the fields s_list . This
list is protected by the spinlock sb_lock . The main use is in super.c:get_super() or
user_get_super() to find the superblock for a given block device. (Both routines are
identical, except that one takes a bdev , the other a dev_t .) This list is also used in various
places where all superblocks must be sync'ed or all dirty inodes must be written out.
The fs_supers list
All superblocks of a given type are collected in a list headed by the fs_supers field of
the struct file_system_type , with links in the fields s_instances . Also this list is protected
by the spinlock sb_lock .
The file list
All open files belonging to a given superblock are chained in a list headed by the
s_files field of the superblock, with links in the fields f_list of the files. These lists are
protected by the spinlock files_lock . This list is used for example in
fs_may_remount_ro() to check that there are no files currently open for writing.
The list of anonymous dentries
Normally, all dentries are connected to root. However, when NFS filehandles are used
this need not be the case. Dentries that are roots of subtrees potentially unconnected
to root are chained in a list headed by the s_anon field of the superblock, with links in
the fields d_hash . These lists are protected by the
spinlock dcache_lock . They are grown in dcache.c:d_alloc_anon() and shrunk in
super.c:generic_shutdown_super() .
The inode lists s_dirty, s_io
Lists of inodes to be written out. These lists are headed at the s_dirty (resp. s_io ) field
of the superblock, with links in the fields i_list . These lists are protected by the
spinlock inode_lock . See fs/fs-writeback.c .
Inodes
An (in-core) inode contains the metadata of a file: its serial number, its protection
(mode), its owner, its size, the dates of last access, creation and last modification, etc. It
also points to the superblock of the filesystem the file is in, the methods for this file,
and the dentries (names) for this file.
struct inode {
unsigned long i_ino;
umode_t i_mode;
uid_t i_uid;
gid_t i_gid;
kdev_t i_rdev;
loff_t i_size;
struct timespec i_atime;
struct timespec i_ctime;
struct timespec i_mtime;
struct super_block *i_sb;
struct inode_operations *i_op;
struct address_space *i_mapping;
struct list_head i_dentry;
...
}
struct inode_operations {
int (*create) (struct inode *, struct dentry *, int);
struct dentry * (*lookup) (struct inode *, struct dentry *);
int (*link) (struct dentry *, struct inode *, struct dentry *);
int (*unlink) (struct inode *, struct dentry *);
int (*symlink) (struct inode *, struct dentry *, const char *);
int (*mkdir) (struct inode *, struct dentry *, int);
int (*rmdir) (struct inode *, struct dentry *);
int (*mknod) (struct inode *, struct dentry *, int, dev_t);
int (*rename) (struct inode *, struct dentry *, struct inode *, struct dentry *);
int (*readlink) (struct dentry *, char *,int);
int (*follow_link) (struct dentry *, struct nameidata *);
void (*truncate) (struct inode *);
int (*permission) (struct inode *, int);
int (*setattr) (struct dentry *, struct iattr *);
int (*getattr) (struct vfsmount *mnt, struct dentry *, struct kstat *);
int (*setxattr) (struct dentry *, const char *, const void *, size_t, int);
ssize_t (*getxattr) (struct dentry *, const char *, void *, size_t);
ssize_t (*listxattr) (struct dentry *, char *, size_t);
int (*removexattr) (struct dentry *, const char *);
};
Each inode is on four lists, with links through the fields i_hash , i_list , i_dentry ,
i_devices .
Dentries
The dentries encode the filesystem tree structure, the names of the files. Thus, the
main parts of a dentry are the inode (if any) that belongs to it, the name (the final part
of the pathname), and the parent (the name of the containing directory). There are
also the superblock, the methods, a list of subdirectories, etc.
struct dentry {
struct inode *d_inode;
struct dentry *d_parent;
struct qstr d_name;
struct super_block *d_sb;
struct dentry_operations *d_op;
struct list_head d_subdirs;
...
}
struct dentry_operations {
int (*d_revalidate)(struct dentry *, int);
int (*d_hash) (struct dentry *, struct qstr *);
int (*d_compare) (struct dentry *, struct qstr *, struct qstr *);
int (*d_delete)(struct dentry *);
void (*d_release)(struct dentry *);
void (*d_iput)(struct dentry *, struct inode *);
};
Each dentry is on five lists, with links through the fields d_hash , d_lru , d_child ,
d_subdirs , d_alias .
Files
File structures represent open files, that is, an inode together with a current
(reading/writing) offset. The offset can be set by the lseek() system call. Note that
instead of a pointer to the inode we have a pointer to the dentry, which means that the
name used to open a file is known. In particular, system calls like getcwd() are possible.
struct file {
struct dentry *f_dentry;
struct vfsmount *f_vfsmnt;
struct file_operations *f_op;
mode_t f_mode;
loff_t f_pos;
struct fown_struct f_owner;
unsigned int f_uid, f_gid;
unsigned long f_version;
...
}
Here the f_owner field gives the owner to use for async I/O signals.
struct file_operations {
struct module *owner;
loff_t (*llseek) (struct file *, loff_t, int);
ssize_t (*read) (struct file *, char *, size_t, loff_t *);
ssize_t (*aio_read) (struct kiocb *, char *, size_t, loff_t);
ssize_t (*write) (struct file *, const char *, size_t, loff_t *);
ssize_t (*aio_write) (struct kiocb *, const char *, size_t, loff_t);
int (*readdir) (struct file *, void *, filldir_t);
unsigned int (*poll) (struct file *, struct poll_table_struct *);
int (*ioctl) (struct inode *, struct file *, unsigned int, unsigned long);
int (*mmap) (struct file *, struct vm_area_struct *);
int (*open) (struct inode *, struct file *);
int (*flush) (struct file *);
int (*release) (struct inode *, struct file *);
int (*fsync) (struct file *, struct dentry *, int datasync);
int (*aio_fsync) (struct kiocb *, int datasync);
int (*fasync) (int, struct file *, int);
int (*lock) (struct file *, int, struct file_lock *);
ssize_t (*readv) (struct file *, const struct iovec *, unsigned long, loff_t *);
ssize_t (*writev) (struct file *, const struct iovec *, unsigned long, loff_t *);
ssize_t (*sendfile) (struct file *, loff_t *, size_t, read_actor_t, void *);
ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long,
unsigned long, unsigned long);
};
Each file is in two lists, with links through the fields f_list , f_ep_links .
f_list
The list with links through f_list was discussed above. It is the list of all files belonging
to a given superblock. There is a second use: the tty driver collects all files that are
opened instances of a tty in a list headed by tty->tty_files with links through the file
field f_list . Conversely, these files point back at the tty via their field private_data .
(This field private_data is also used elsewhere. For example, the proc code uses it to
attach a struct seq_file to a file.)
The event poll list
All event poll items belonging to a given file are collected in a list with head
f_ep_links, protected by the file field f_ep_lock. (For event poll details, see epoll_ctl(2).)
struct vfsmount
A struct vfsmount describes a mount. The definition lives in mount.h :
struct vfsmount {
struct list_head mnt_hash;
struct vfsmount *mnt_parent; /* fs we are mounted on */
struct dentry *mnt_mountpoint; /* dentry of mountpoint */
struct dentry *mnt_root; /* root of the mounted tree */
struct super_block *mnt_sb; /* pointer to superblock */
struct list_head mnt_mounts; /* list of children, anchored here */
struct list_head mnt_child; /* and going through their mnt_child */
atomic_t mnt_count;
int mnt_flags;
char *mnt_devname; /* Name of device e.g. /dev/dsk/hda1 */
struct list_head mnt_list;
};
fs_struct
A struct fs_struct determines the interpretation of pathnames referred to by a process
(and also, somewhat illogically, contains the umask). The typical reference is current->fs.
The definition lives in fs_struct.h :
struct fs_struct {
atomic_t count;
rwlock_t lock;
int umask;
struct dentry * root, * pwd, * altroot;
struct vfsmount * rootmnt, * pwdmnt, * altrootmnt;
};
Semantics of root and pwd are clear. Remains to discuss altroot .
6. The File System
●Files associated with a process
There are two normal cases for handling the descriptors after a fork.
1. The parent waits for the child to complete. In this case, the parent does not need
to do anything with its descriptors. When the child terminates, any of the
shared descriptors that the child read from or wrote to will have their file offsets
updated accordingly.
2. Both the parent and the child go their own ways. Here, after the fork, the
parent closes the descriptors that it doesn’t need, and the child does the same
thing. This way, neither interferes with the other’s open descriptors. This
scenario is often found with network servers.
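The first case can be checked with a short experiment. This is a minimal sketch under stated assumptions: the helper name parent_offset_after_child_write() and the idea of passing in a scratch file path are illustrative choices, not part of the course material. The parent opens a file, the child writes to the inherited descriptor, and the parent then queries the offset without moving it.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Opens path, forks, lets the child write len bytes of text, and returns
 * the file offset that the parent sees afterwards (-1 on error). */
static off_t parent_offset_after_child_write(const char *path,
                                             const char *text, size_t len)
{
    int status;
    off_t pos;
    pid_t pid;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: this write advances the offset of the shared open file. */
        ssize_t n = write(fd, text, len);
        _exit(n == (ssize_t)len ? 0 : 1);
    }

    /* Parent: case 1 from the list above, simply wait for the child. */
    if (waitpid(pid, &status, 0) == -1) {
        close(fd);
        return -1;
    }

    pos = lseek(fd, 0, SEEK_CUR);   /* query the offset without moving it */
    close(fd);
    return pos;
}
```

If the child writes 5 bytes, the parent should see an offset of 5 even though the parent itself never wrote, because both descriptors refer to the same open file.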
Besides the open files, numerous other properties of the parent are inherited by the
child:
• Real user ID, real group ID, effective user ID, and effective group ID
• Supplementary group IDs
• Process group ID
• Session ID
• Controlling terminal
• The set-user-ID and set-group-ID flags
• Current working directory
• Root directory
• File mode creation mask
• Signal mask and dispositions
• The close-on-exec flag for any open file descriptors
• Environment
• Attached shared memory segments
• Memory mappings
• Resource limits
The differences between the parent and child are
• The return values from fork are different.
• The process IDs are different.
• The two processes have different parent process IDs: the parent process ID of the
child is the parent; the parent process ID of the parent doesn’t change.
• The child’s tms_utime, tms_stime, tms_cutime, and tms_cstime values are set to 0
• File locks set by the parent are not inherited by the child.
• Pending alarms are cleared for the child.
• The set of pending signals for the child is set to the empty set.
6. The File System
●proc file system
/proc is a window into the running Linux kernel. Files in the /proc file system don't
correspond to actual files on a physical device. Instead, they are magic objects that
behave like files but provide access to parameters, data structures, and statistics in the
kernel. The "contents" of these files are not always fixed blocks of data, as ordinary file
contents are. Instead, they are generated on the fly by the Linux kernel when you read
from the file. You can also change the configuration of the running kernel by writing
to certain files in the /proc file system.
Let's look at an example:
% ls -l /proc/version
-r--r--r-- 1 root root 0 Jan 17 18:09 /proc/version
The size is 0 because the contents are generated by the kernel.
% mount
none on /proc type proc (rw)
The device name "none" reveals that this is not a file system on disk.
Extracting Information from /proc
#include <stdio.h>
#include <string.h>
/* Returns the clock speed of the system's CPU in MHz, as reported by
   /proc/cpuinfo. On a multiprocessor machine, returns the speed of
   the first CPU. On error returns zero. */
float get_cpu_clock_speed (void)
{
    FILE* fp;
    char buffer[1024];
    size_t bytes_read;
    char* match;
    float clock_speed;
    /* Read the beginning of /proc/cpuinfo into the buffer. The entry for
       the first CPU comes first, so the start of the file is enough. */
    fp = fopen ("/proc/cpuinfo", "r");
    if (fp == NULL)
        return 0;
    bytes_read = fread (buffer, 1, sizeof (buffer) - 1, fp);
    fclose (fp);
    /* Bail if read failed. */
    if (bytes_read == 0)
        return 0;
    /* NUL-terminate the text. */
    buffer[bytes_read] = '\0';
    /* Locate the line that starts with "cpu MHz". */
    match = strstr (buffer, "cpu MHz");
    if (match == NULL)
        return 0;
    /* Parse the line to extract the clock speed. */
    if (sscanf (match, "cpu MHz : %f", &clock_speed) != 1)
        return 0;
    return clock_speed;
}
int main (void)
{
    printf ("CPU clock speed: %4.0f MHz\n", get_cpu_clock_speed ());
    return 0;
}
Various directories and files in /proc
1) /proc/<number> #one directory per running process
2) /proc/self #the current process
3) /proc/cpuinfo
4) /proc/devices
5) /proc/pci #summary of devices connected to the PCI bus
6) /proc/tty/driver/serial #serial ports
7) /proc/sys/kernel #kernel information
8) /proc/meminfo #system's memory usage
9) /proc/filesystems #file system types known to the kernel
10) /proc/mounts #all mounted file systems
6. The File System
●System Calls
1. fcntl Record Locking
#include <fcntl.h>
int fcntl(int fd, int cmd);
int fcntl(int fd, int cmd, long arg);
int fcntl(int fd, int cmd, struct flock *lock);
Returns: depends on cmd if OK (see following), −1 on error
For record locking, cmd is F_GETLK, F_SETLK, or F_SETLKW.
struct flock {
short l_type; /* F_RDLCK, F_WRLCK, or F_UNLCK */
short l_whence; /* SEEK_SET, SEEK_CUR, or SEEK_END */
off_t l_start; /* offset in bytes, relative to l_whence */
off_t l_len; /* length, in bytes; 0 means lock to EOF */
pid_t l_pid; /* returned with F_GETLK */
};
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
int main(void)
{
    int fd, retval;
    pid_t pid;
    struct flock lockc, lockp;
    /* The file "testlock" must already exist. */
    if ((fd = open("testlock", O_WRONLY)) == -1) {
        perror("open testlock");
        return 1;
    }
    lockp.l_type = F_WRLCK;
    lockp.l_whence = SEEK_SET;
    lockp.l_start = 10;
    lockp.l_len = 15;
    /* Parent locks bytes 10..24 of the file */
    if ((retval = fcntl(fd, F_SETLK, &lockp)) == -1)
        perror("parent write lock");
    printf("retval is %d\n", retval);
    if ((pid = fork()) == 0) {
        lockc.l_type = F_WRLCK;
        lockc.l_whence = SEEK_SET;
        lockc.l_start = 40;
        lockc.l_len = 55;
        /* Child locks bytes 40..94, which do not overlap the parent's range */
        if ((retval = fcntl(fd, F_SETLK, &lockc)) == -1)
            perror("child write lock");
        printf("retval is %d\n", retval);
        printf("Child Process over\n");
    } else {
        sleep(3);
        lockp.l_type = F_UNLCK;
        lockp.l_whence = SEEK_SET;
        lockp.l_start = 10;
        lockp.l_len = 15;
        /* Parent unlocks its range */
        if ((retval = fcntl(fd, F_SETLK, &lockp)) == -1)
            perror("parent unlock");
        printf("Parent Process over\n");
    }
    return 0;
}
In the next example both processes request a read lock on the same region, so both
requests succeed. To see a conflict, open the file with O_RDWR and request write locks instead.
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
int main(void)
{
    int fd, retval;
    pid_t pid;
    struct flock lockc, lockp;
    if ((fd = open("testlock", O_RDONLY)) == -1) {
        perror("open testlock");
        return 1;
    }
    lockp.l_type = F_RDLCK;
    lockp.l_whence = SEEK_SET;
    lockp.l_start = 10;
    lockp.l_len = 15;
    /* Parent locks the region */
    if ((retval = fcntl(fd, F_SETLK, &lockp)) == -1)
        perror("parent read lock");
    printf("Parent retval is %d\n", retval);
    /* Child starts here */
    if ((pid = fork()) == 0) {
        /* F_GETLK needs a description of the lock to test. Ask about a
           write lock, which the parent's read lock would block. */
        lockc.l_type = F_WRLCK;
        lockc.l_whence = SEEK_SET;
        lockc.l_start = 10;
        lockc.l_len = 15;
        if ((retval = fcntl(fd, F_GETLK, &lockc)) == -1)
            perror("child get lock");
        printf("retval is %d\n", retval);
        printf("process %d has locked this section\n", (int)lockc.l_pid);
        printf("lock type %d\n", lockc.l_type);
        printf("whence %d\n", lockc.l_whence);
        printf("start %ld\n", (long)lockc.l_start);
        printf("length is %ld\n", (long)lockc.l_len);
        lockc.l_type = F_RDLCK;
        lockc.l_whence = SEEK_SET;
        lockc.l_start = 10;
        lockc.l_len = 15;
        /* Child locks the same region for reading */
        if ((retval = fcntl(fd, F_SETLK, &lockc)) == -1)
            perror("child read lock");
        printf("Child retval is %d\n", retval);
        printf("Child Process over\n");
    } else {
        sleep(3);
        printf("Parent Process over\n");
    }
    return 0;
}
2. lockf
SYNOPSIS
#include <unistd.h>
int lockf(int fd, int cmd, off_t len);
- apply, test or remove a POSIX lock on an open file
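The following program deadlocks: the parent holds the lock while it waits for the child,
and the child blocks in lockf(F_LOCK) waiting for the parent to release the lock. Use
F_TLOCK in the child's lockf() call to avoid the deadlock; it fails immediately instead of blocking.
#include <fcntl.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>
int main(void)
{
    int fd;
    pid_t pid;
    if ((fd = open("locktest", O_RDWR | O_CREAT, 0666)) == -1) {
        perror("open file locktest");
        return 1;
    }
    if (lockf(fd, F_LOCK, 10) == -1)
        perror("lockf failed");
    if ((pid = fork()) == 0) {
        /* The child blocks here: deadlock. Use F_TLOCK to fail instead. */
        if (lockf(fd, F_LOCK, 10) == -1)
            perror("lockf failed");
        puts("The child process over");
    } else {
        wait(0);
        printf("Process %d is over\n", (int)getpid());
    }
    return 0;
}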
3. access
#include <unistd.h>
int access(const char *pathname, int mode);
access() checks whether the process would be allowed to read, write or test for
existence of the file (or other file system object) whose name is pathname. If
pathname is a symbolic link, the permissions of the file referred to by this
symbolic link are tested.
mode is a mask consisting of one or more of R_OK, W_OK, X_OK and F_OK.
R_OK, W_OK and X_OK request checking whether the file exists and has read, write
and execute permissions, respectively. F_OK just requests checking for the
existence of the file.
#include <stdio.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
    const char *path;
    if (argc < 2) {
        fprintf(stderr, "usage: %s pathname\n", argv[0]);
        return 1;
    }
    path = argv[1];
    if (access(path, F_OK) == 0) /* check whether the file exists */
        printf(" %s file exists\n", path);
    else
        perror(path);
    return 0;
}
4. open, creat
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
int creat(const char *pathname, mode_t mode);
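The prototypes above come without an example, so here is a minimal sketch. It relies on the documented equivalence of creat(path, mode) and open(path, O_WRONLY | O_CREAT | O_TRUNC, mode); the helper name make_empty_file() and its use_creat switch are hypothetical, introduced only to show both calls side by side.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates (or truncates) path and returns its size afterwards, -1 on error.
 * use_creat selects creat(); otherwise the equivalent open() call is used. */
static off_t make_empty_file(const char *path, int use_creat)
{
    struct stat st;
    off_t size;
    int fd = use_creat
        ? creat(path, 0644)
        : open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd == -1)
        return -1;

    /* Both variants leave an empty file that is open for writing only. */
    size = (fstat(fd, &st) == 0) ? st.st_size : -1;
    close(fd);
    return size;
}
```

Because creat() always opens the file write-only, open() with explicit flags is the usual choice when the new file must also be read.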
5. dup, dup2
#include <unistd.h>
int dup(int oldfd);
int dup2(int oldfd, int newfd);
dup() and dup2() create a copy of the file descriptor oldfd.
After a successful return from dup() or dup2(), the old and new file descriptors
may be used interchangeably. They refer to the same open file description and thus share
the file offset and file status flags; for example, if the file offset is modified by using
lseek(2) on one of the descriptors, the offset is also changed for the other.
The two descriptors do not share file descriptor flags (the close-on-exec flag).
The close-on-exec flag (FD_CLOEXEC; see fcntl(2)) for the duplicate descriptor is off.
dup() uses the lowest-numbered unused descriptor for the new descriptor.
dup2() makes newfd be the copy of oldfd, closing newfd first if necessary.
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>
int main(void)
{
    int fd, newfd;
    if ((fd = creat("testfile", 0666)) == -1) {
        perror("creat failed");
        exit(1);
    }
    printf("Descriptor is %d", fd);
    newfd = dup2(fd, 5); /* try with stdout (descriptor 1) */
    printf("\nNew Descriptor is %d\n", newfd);
    printf("The PID is %d\n", (int)getpid());
    /* Wait here so that /proc/<PID>/fd can be inspected; stop with Ctrl-C */
    pause();
    close(fd);
    close(newfd);
    return 0;
}
Using fcntl to create a copy
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
int main(void)
{
    int fd, fd1, newfd;
    fd = open("temp", O_RDWR | O_CREAT, 0666);
    printf("The file descriptor is %d\n", fd);
    fd1 = open("temp1", O_RDWR | O_CREAT, 0666);
    /* F_DUPFD returns the lowest free descriptor >= the third argument */
    newfd = fcntl(fd, F_DUPFD, 0);
    printf("The file descriptor is %d\n", newfd);
    close(newfd);
    close(fd1);
    close(fd);
    return 0;
}
6. mmap
#include <sys/mman.h>
void *mmap(void *start, size_t length, int prot, int flags, int fd, off_t
offset);
int munmap(void *start, size_t length);
The mmap() function asks to map length bytes starting at offset offset from the
file (or other object) specified by the file descriptor fd into memory, preferably at
address start.
This latter address is a hint only, and is usually specified as 0. The actual place
where the object is mapped is returned by mmap().
The prot argument describes the desired memory protection (and must not
conflict with the open mode of the file). It is either PROT_NONE or is the
bitwise OR of one or more of the other PROT_* flags.
PROT_EXEC Pages may be executed.
PROT_READ Pages may be read.
PROT_WRITE Pages may be written.
PROT_NONE Pages may not be accessed.
The flags parameter specifies the type of the mapped object, mapping options
and whether modifications made to the mapped copy of the page are private to the
process or are to be shared with other references. It has bits
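MAP_FIXED Do not select a different address than the one specified. If the
memory region specified by start and length overlaps pages of an existing mapping,
the overlapped part of the existing mapping is discarded.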
MAP_SHARED Share this mapping with all other processes that map this object.
Storing to the region is equivalent to writing to the file.
MAP_PRIVATE Create a private copy-on-write mapping. Stores to the region
do not affect the original file. It is unspecified whether changes made to the file
after the mmap() call are visible in the mapped region.
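#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
    const char msg[] = "hello\n";
    size_t len = sizeof(msg) - 1; /* 6 bytes, without the trailing NUL */
    int fd;
    void *addr;
    if (argc < 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        exit(1);
    }
    if ((fd = open(argv[1], O_RDWR | O_CREAT, 0666)) < 0) {
        perror("open");
        exit(1);
    }
    /* The file must be at least as long as the mapping: write one byte
       at the last offset so that the file covers len bytes. */
    lseek(fd, len - 1, SEEK_SET);
    write(fd, "", 1);
    /* Map the file into memory */
    addr = mmap(0, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        exit(1);
    }
    close(fd); /* the mapping stays valid after close */
    memcpy(addr, msg, len); /* stores to the region go to the file */
    munmap(addr, len);
    return 0;
}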
/* Use the peripheral base address that matches the board */
//#define BCM2708_PERI_BASE 0x20000000
#define BCM2708_PERI_BASE 0x3F000000
#define GPIO_BASE (BCM2708_PERI_BASE + 0x200000) /* GPIO controller */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#define PAGE_SIZE (4*1024)
#define BLOCK_SIZE (4*1024)
int mem_fd;
void *gpio_map;
// I/O access
volatile unsigned *gpio;
// GPIO setup macros. Always use INP_GPIO(x) before using OUT_GPIO(x) or SET_GPIO_ALT(x,y)
#define INP_GPIO(g) *(gpio+((g)/10)) &= ~(7<<(((g)%10)*3))
#define OUT_GPIO(g) *(gpio+((g)/10)) |= (1<<(((g)%10)*3))
#define SET_GPIO_ALT(g,a) *(gpio+(((g)/10))) |= (((a)<=3?(a)+4:(a)==4?3:2)<<(((g)%10)*3))
#define GPIO_SET *(gpio+7) // sets bits which are 1, ignores bits which are 0
#define GPIO_CLR *(gpio+10) // clears bits which are 1, ignores bits which are 0
#define GPIO_READ(g) (*(gpio+13) & (1<<(g))) // 0 if LOW, (1<<g) if HIGH
void setup_io(void);
int main(int argc, char **argv)
{
    int rep;
    // Set up gpio pointer for direct register access
    setup_io();
    // Set GPIO pins 7 and 4 as outputs
    INP_GPIO(7); // must use INP_GPIO before we can use OUT_GPIO
    INP_GPIO(4); // must use INP_GPIO before we can use OUT_GPIO
    OUT_GPIO(7);
    OUT_GPIO(4);
    // flash LED on and off 10 times
    for (rep = 0; rep < 10; rep++) {
        printf("setting\n");
        GPIO_SET = (1 << 7) | (1 << 4);
        sleep(1);
        printf("resetting\n");
        GPIO_CLR = (1 << 7) | (1 << 4);
        sleep(1);
    }
    return 0;
} // main
// Set up a memory region to access GPIO
void setup_io(void)
{
    /* open /dev/mem */
    if ((mem_fd = open("/dev/mem", O_RDWR|O_SYNC)) < 0) {
        printf("can't open /dev/mem \n");
        exit(-1);
    }
    /* mmap GPIO */
    gpio_map = mmap(
        NULL,                 // Any address in our space will do
        BLOCK_SIZE,           // Map length
        PROT_READ|PROT_WRITE, // Enable reading & writing to mapped memory
        MAP_SHARED,           // Shared with other processes
        mem_fd,               // File to map
        GPIO_BASE             // Offset to GPIO peripheral
    );
    close(mem_fd); // No need to keep mem_fd open after mmap
    if (gpio_map == MAP_FAILED) {
        perror("mmap"); // errno describes the failure
        exit(-1);
    }
    // Always use volatile pointer!
    gpio = (volatile unsigned *)gpio_map;
} // setup_io()
7. mount
mount [-lhV]
mount -a [-fFnrsvw] [-t vfstype] [-O optlist]
mount [-fnrsvw] [-o options [,...]] device | dir
mount [-fnrsvw] [-t vfstype] [-o options] device dir
Mount a file system
All files accessible in a Unix system are arranged in one big tree, the file
hierarchy, rooted at /. These files can be spread out over several devices. The mount
command serves to attach the file system found on some device to the big file
tree. Conversely, the umount(8) command will detach it again.
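The standard form of the mount command is
mount -t type device dir
The mount(2) system call does the same from a program:
#include <fcntl.h>
#include <stdio.h>
#include <sys/mount.h>
#include <unistd.h>
int main(void)
{
    int fd;
    /* Requires root privileges */
    if (mount("/dev/fd0", "/mnt/floppy", "ext2", MS_NOSUID, NULL) == -1) {
        perror("mount");
        return 1;
    }
    printf(" Floppy mounted successfully\n");
    printf(" Changing Directory to floppy\n");
    if (chdir("/mnt/floppy") == -1) {
        perror("chdir");
        return 1;
    }
    printf(" Creating a file test_file in floppy\n");
    fd = creat("test_file", 0644);
    if (fd != -1) {
        printf(" File Creation successful\n");
        close(fd);
    }
    return 0;
}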
8. readv, writev
#include <sys/uio.h>
ssize_t readv(int fd, const struct iovec *vector, int count);
ssize_t writev(int fd, const struct iovec *vector, int count);
readv, writev - read or write data into multiple buffers
The readv() function reads count blocks from the file associated with the file
descriptor fd into the multiple buffers described by vector.
The writev() function writes at most count blocks described by vector to the file
associated with the file descriptor fd.
The pointer vector points to a struct iovec defined in <sys/uio.h> as
struct iovec {
void *iov_base; /* Starting address */
size_t iov_len; /* Number of bytes */
};
#include <fcntl.h>
#include <stdio.h>
#include <sys/uio.h>
#include <unistd.h>
struct emp {
    char name[25];
    int age;
    float sal;
} obj[2], Emp[2] = {{"Hello", 10, 123.345}, {"World", 20, 234.567}};
int main(void)
{
    struct iovec readiovobj, ioobj;
    int fd;
    ssize_t retval;
    ioobj.iov_base = Emp;
    ioobj.iov_len = sizeof(Emp);
    printf("%zu\n", ioobj.iov_len);
    fd = open("temp", O_CREAT | O_RDWR, 0666);
    if (fd == -1) {
        perror("open");
        return 1;
    }
    retval = writev(fd, &ioobj, 1); /* write both records */
    printf("%zd\n", retval);
    lseek(fd, 0, SEEK_SET); /* rewind before reading back */
    readiovobj.iov_base = obj;
    readiovobj.iov_len = sizeof(obj);
    retval = readv(fd, &readiovobj, 1); /* read both records into obj */
    printf("%zd\n", retval);
    close(fd);
    return 0;
}
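The program above passes a single buffer, so it does not yet show what the vector is for. The following minimal sketch uses two buffers per call; the helper name gather_then_scatter(), the sample contents "HEAD" and "payload", and the scratch file path are assumptions made for illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Writes a 4-byte header and a 7-byte body with one writev() call, then
 * reads them back into two separate buffers with one readv() call.
 * Returns the number of bytes read back, or -1 on error. */
static ssize_t gather_then_scatter(const char *path,
                                   char head_out[4], char body_out[7])
{
    char head[4] = { 'H', 'E', 'A', 'D' };
    char body[7] = { 'p', 'a', 'y', 'l', 'o', 'a', 'd' };
    struct iovec out[2], in[2];
    ssize_t n;
    int fd;

    out[0].iov_base = head;     out[0].iov_len = sizeof head;
    out[1].iov_base = body;     out[1].iov_len = sizeof body;
    in[0].iov_base  = head_out; in[0].iov_len  = 4;
    in[1].iov_base  = body_out; in[1].iov_len  = 7;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    /* Gather: both buffers reach the file in order, in a single call. */
    if (writev(fd, out, 2) != 11 || lseek(fd, 0, SEEK_SET) == -1) {
        close(fd);
        return -1;
    }

    /* Scatter: the first 4 bytes fill head_out, the next 7 fill body_out. */
    n = readv(fd, in, 2);
    close(fd);
    return n;
}
```

A typical use is a record that consists of a fixed header and a variable body kept in different places in memory: the vector avoids copying them into one contiguous buffer first.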
9. pread, pwrite
#define _XOPEN_SOURCE 500
#include <unistd.h>
ssize_t pread(int fd, void *buf, size_t count, off_t offset);
ssize_t pwrite(int fd, const void *buf, size_t count, off_t offset);
pread, pwrite - read from or write to a file descriptor at a given offset
pread() reads up to count bytes from file descriptor fd at offset offset (from the
start of the file) into the buffer starting at buf. The file offset is not changed.
pwrite() writes up to count bytes from the buffer starting at buf to the file
descriptor fd at offset offset. The file offset is not changed.
The file referenced by fd must be capable of seeking.
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
int main(void)
{
    int fd1, fd2;
    ssize_t n;
    char ch[1024];
    if ((fd1 = open("/etc/passwd", O_RDONLY)) == -1) {
        perror("Unable to open source");
        exit(1);
    }
    /* Read up to 100 bytes starting at offset 100 */
    n = pread(fd1, ch, 100, 100);
    if (n > 0)
        printf("%.*s\n", (int)n, ch);
    close(fd1);
    if ((fd2 = open("newfile", O_WRONLY | O_CREAT, 0666)) == -1) {
        perror("Unable to open target");
        exit(1);
    }
    /* Both writes go to offset 500, so the Ys overwrite the Xs */
    pwrite(fd2, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX", 40, 500);
    pwrite(fd2, "YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY", 40, 500);
    close(fd2);
    return 0;
}
Day 3 - Morning
7. Process Management
●Process Defined
●Process Descriptor Structures in the kernel
●Process States
●Process Scheduling
●Process Creation
●System calls related to process management
- A process is created from a program, which is a file in the file system.
- A process is object code in execution: an active, alive, running program.
- Processes are more than just assembly language; they consist of data, resources, state,
and a virtualized computer.
- A process uses many resources such as memory space, CPU and files during its lifetime.
- A process contains threads, belongs to a process group and has a parent process.
A process group belongs to a session. A session has a controlling terminal (tty); at most
one process group, the foreground process group, is attached to the terminal at a time.
The remaining process groups are background process groups.
- A process is a program instance that the kernel schedules onto the processor for execution.
The main thread of a process is the actual entity that gets scheduled on the CPU. The kernel
maintains a separate copy of the registers and various other data structures for each process.
- In a multiprocessing environment, the register values saved in the context of a process
are loaded into the actual registers when its execution resumes.
- A process is an entry in the task vector and is an instance of task_struct.
7. Process Management
●Process Descriptor Structures in the kernel
Process Structure
• Every process is represented by a task_struct data structure.
• This structure is quite large and complex.
• Whenever a new process is created, the kernel creates a new task_struct
structure, and the complete process
information is maintained in that structure.
• When a process is terminated, the corresponding structure is
removed.
• The task_struct instances are linked in a doubly linked list.
• Solaris uses the proc structure to manage processes.
task_struct task[256];
struct task_struct {
volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */
void *stack;
atomic_t usage;
unsigned int flags; /* per process flags, defined below */
unsigned int ptrace;
#ifdef CONFIG_SMP
struct llist_node wake_entry;
int on_cpu;
struct task_struct *last_wakee;
unsigned long wakee_flips;
unsigned long wakee_flip_decay_ts;
int wake_cpu;
#endif
int on_rq;
int prio, static_prio, normal_prio;
unsigned int rt_priority;
const struct sched_class *sched_class;
struct sched_entity se;
struct sched_rt_entity rt;
#ifdef CONFIG_CGROUP_SCHED
struct task_group *sched_task_group;
#endif
#ifdef CONFIG_PREEMPT_NOTIFIERS
/* list of struct preempt_notifier: */
struct hlist_head preempt_notifiers;
#endif
/*
* fpu_counter contains the number of consecutive context switches
* that the FPU is used. If this is over a threshold, the lazy fpu
* saving becomes unlazy to save the trap. This is an unsigned char
* so that after 256 times the counter wraps and the behavior turns
* lazy again; this to deal with bursty apps that only use FPU for
* a short time
*/
unsigned char fpu_counter;
#ifdef CONFIG_BLK_DEV_IO_TRACE
unsigned int btrace_seq;
#endif
unsigned int policy;
int nr_cpus_allowed;
cpumask_t cpus_allowed;
#ifdef CONFIG_PREEMPT_RCU
int rcu_read_lock_nesting;
char rcu_read_unlock_special;
struct list_head rcu_node_entry;
#endif /* #ifdef CONFIG_PREEMPT_RCU */
#ifdef CONFIG_TREE_PREEMPT_RCU
struct rcu_node *rcu_blocked_node;
#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
#ifdef CONFIG_RCU_BOOST
struct rt_mutex *rcu_boost_mutex;
#endif /* #ifdef CONFIG_RCU_BOOST */
#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
struct sched_info sched_info;
#endif
struct list_head tasks;
#ifdef CONFIG_SMP
struct plist_node pushable_tasks;
#endif
struct mm_struct *mm, *active_mm;
#ifdef CONFIG_COMPAT_BRK
unsigned brk_randomized:1;
#endif
#if defined(SPLIT_RSS_COUNTING)
struct task_rss_stat rss_stat;
#endif
/* task state */
int exit_state;
int exit_code, exit_signal;
int pdeath_signal; /* The signal sent when the parent dies */
unsigned int jobctl; /* JOBCTL_*, siglock protected */
...
};
To run UNIX, the computer hardware must provide two modes of execution:
– kernel mode
– user mode
Some computers have more than two execution modes.
– e.g. Intel x86 processors have four privilege levels (rings 0 to 3).
Each process has a virtual address space; references to virtual memory are translated to
physical memory locations using a set of address translation maps.
7. Process Management
●Process States
7. Process Management
●Process Scheduling
Scheduling (Kernel perspective)
• The kernel keeps track of a process's creation time as well as
the CPU time that it consumes during its lifetime.
• Timekeeping is driven by the system clock, a combination of software and hardware setup.
• It is independent of CPU frequency.
• One clock tick is called a jiffy. The system's interactive response
depends on the clock frequency.
– For example: a jiffy may be 10 ms (100 Hz) or 1 ms
(1000 Hz) depending on the implementation.
• On each clock tick, the kernel updates the amount of
time that the current process has spent in system
and in user mode.
• Linux also supports process-specific interval
timers: processes can use system calls to set up
timers that send signals to themselves when the
timers expire. These timers can be single-shot or
periodic.
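The last point can be sketched with setitimer(2), one of the system calls that arms such a timer. This is a minimal sketch, not the only way to do it: the helper name wait_for_single_shot_timer() and the flag timer_fired are hypothetical, and the example uses the real-time timer ITIMER_REAL, which delivers SIGALRM.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t timer_fired;

static void on_alarm(int signo)
{
    (void)signo;
    timer_fired = 1;
}

/* Arms a single-shot interval timer and sleeps until SIGALRM arrives.
 * Returns 0 on success, -1 on error. */
static int wait_for_single_shot_timer(long milliseconds)
{
    struct sigaction sa;
    struct itimerval tv;
    sigset_t block, old, waitmask;

    if (milliseconds <= 0)
        return -1;              /* a zero value would disarm the timer */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return -1;

    /* Block SIGALRM so it cannot arrive before sigsuspend() is waiting. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old) == -1)
        return -1;
    waitmask = old;
    sigdelset(&waitmask, SIGALRM);

    timer_fired = 0;
    memset(&tv, 0, sizeof tv);
    tv.it_value.tv_sec = milliseconds / 1000;
    tv.it_value.tv_usec = (milliseconds % 1000) * 1000;
    /* it_interval stays zero: single-shot. A non-zero it_interval would
       re-arm the timer after every expiry, making it periodic. */
    if (setitimer(ITIMER_REAL, &tv, NULL) == -1) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    while (!timer_fired)
        sigsuspend(&waitmask);  /* atomically unblock SIGALRM and wait */

    sigprocmask(SIG_SETMASK, &old, NULL);
    return 0;
}
```

Calling wait_for_single_shot_timer(20) should return after roughly 20 ms, once the signal handler has run.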
Process Scheduling
• The job of a scheduler is to select the most
deserving process to run out of all of the runnable
processes in the run queue.
• It implements fair scheduling to avoid starvation.
• It implements a suitable scheduling policy.
• It updates the state of the processes on every clock tick
(jiffy).
• Policy - FIFO, Round Robin, Shortest Job First,
FILO, priority based, etc.
• Priority - a higher priority process will be allowed
to run.
• Pre-emptive and non-preemptive scheduling.
• rt_priority – many UNIX variants support a real-time
scheduling priority range.
Priority Range
Scheduling priorities (in a typical UNIX system)
have integer values between 0 and 127, with smaller numbers
meaning higher priorities.
• For Solaris: 0 to 169
• For Linux: 0 to 139
Process Scheduling: Linux
• The Linux kernel implements two separate priority ranges.
• The first is the nice value, a number from -20 to 19 with a
default of zero. Larger nice values correspond to a lower
priority.
• A process with a nice value of -20 receives the maximum
time slice, whereas a process with a nice value of 19
receives the minimum time slice.
• Time slice: minimum 10 ms, default 150 ms, maximum 300 ms.
• The second range is the real-time priority.
• By default, it ranges from zero to 99.
• All real-time processes are at a higher priority
than normal processes.
• Linux implements real-time priorities in
accordance with POSIX.
• Linux provides two real-time scheduling policies,
SCHED_FIFO and SCHED_RR.
• The normal non real-time scheduling policy is
SCHED_OTHER.
• SCHED_FIFO schedules first in, first out without time slices, so a process can run
until it blocks or explicitly yields the processor.
• SCHED_RR is identical to SCHED_FIFO except that each
process can only run until it exhausts a predetermined time
slice.
Scheduler System Calls
nice()                    Set a process's nice value
sched_setscheduler()      Set a process's scheduling policy
sched_getscheduler()      Get a process's scheduling policy
sched_setparam()          Set a process's real-time priority
sched_getparam()          Get a process's real-time priority
sched_get_priority_max()  Get the maximum real-time priority
sched_get_priority_min()  Get the minimum real-time priority
sched_rr_get_interval()   Get a process's timeslice value
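Some of these calls only query state and can be tried without special privileges. The following is a minimal sketch; the struct rt_range and the helpers get_rt_range() and current_policy() are hypothetical wrappers introduced for illustration, while the sched_* functions are the ones from the table.

```c
#include <sched.h>

struct rt_range { int min, max; };

/* Queries the priority range that the system reports for a policy such as
 * SCHED_FIFO or SCHED_RR. Returns 0 on success, -1 on error. */
static int get_rt_range(int policy, struct rt_range *out)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo == -1 || hi == -1)
        return -1;
    out->min = lo;
    out->max = hi;
    return 0;
}

/* Returns the scheduling policy of the calling process, -1 on error.
 * A pid argument of 0 means "the calling process". */
static int current_policy(void)
{
    return sched_getscheduler(0);
}
```

For an ordinary process current_policy() is expected to return SCHED_OTHER. Changing the policy with sched_setscheduler() to SCHED_FIFO or SCHED_RR normally requires elevated privileges, so it is left out of this sketch.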
7. Process Management
●Process Creation
A parent process creates child processes, which in turn create other
processes, forming a tree of processes.
Resource sharing (possible models)
• Parent and children share all resources.
• Children share a subset of the parent's resources.
• Parent and child share no resources.
Execution
• Parent and children execute concurrently.
• Parent waits until children terminate.
Address space
• The child is a duplicate of the parent.
• The child has a new program loaded into it.
fork ( )
• pid_t fork (void); creates a new process.
• All statements after the fork() system call in a program are
executed by two processes - the original process that used
fork(), plus the new process that is created by fork( ).
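#include <stdio.h>
#include <unistd.h>
int main(void) {
    printf("Hello fork: %d\n", fork());
    return 0;
}
– Hello fork: 0 (printed by the child)
– Hello fork: x (x > 0, printed by the parent; x is the child's PID)
– Hello fork: -1 (printed once, by the caller, if fork() fails)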
Parent and Child
pid = fork();
if (pid == 0) {
    /* child code */
}
else if (pid > 0) {
    /* parent code */
    wait (0); /* or */
    waitpid(pid, ....);
}
/* pid == -1 means fork() failed */
Zombie State and Orphan Process
• When a child process exits, the kernel keeps its exit
status until the parent process collects it with wait().
• If the parent is busy or suspended and has not yet called
wait(), the terminated child stays in the process table.
• Such a state is called Zombie.
• If the parent exits before the child, the child becomes
an orphan process and is adopted by the init process,
which takes care of reaping it.
Copy on Write (COW)
• Instead of copying the address space of the parent, UNIX
uses the COW technique for economical use of the memory
page.
• The parent space is not copied, it can be shared by both the
parent and the child process but the memory pages are
marked as write protected.
• If the parent or the child writes to a shared page, the kernel
copies just that page, so the writer gets its own private copy.
• Advantage: Kernel can defer or prevent copying of a parent
process address space.
execl
To run a new program in a process, use one of
the “exec” family of
calls (such as “execl”) and specify the following:
• the pathname of the program to run
• the name of the program (passed as argv[0])
• each parameter to the program
• (char *)0 or NULL as the last parameter to
mark the end of the parameter list
exec Family
int execl (const char *path, const char *arg, ...);
int execlp (const char *file, const char *arg, ...);
int execle (const char *path, const char *arg, ..., char *const envp[ ]);
int execv (const char *path, char *const argv[ ]);
int execvp (const char *file, char *const argv[ ]);
All of the above library functions internally call the execve system call.
int execve (const char *filename, char *const argv[ ], char *const envp[ ]);
Text Portion
• User Context consists of the portions accessible to the
process while running in user mode.
• The text portion of a process contains the actual
machine instructions that are executed by the
hardware.
• When a program is executed by the OS, the text
portion is read into memory from its disk file,
unless the OS supports shared text and a copy of
program is already being executed.
Data Portion
• The data portion contains the program’s data. It is
possible for this to be divided into 3 pieces.
• Initialized read only data contains elements that are
initialized by the program and are read only while the
process is executing.
• Initialized read write data contains data elements that
are initialized by the program and may have their
values modified during execution of the process.
Stack Portion
• Un-initialized data contains data elements that are
not initialized by the program but are set to zero
before execution starts .
• The heap is used while a process is running to
allocate more data space dynamically to the
process.
• The stack is used dynamically while the process is
running to contain the stack frames that are used
by many programming languages.
Kernel Context
• The stack frames contain the return address linkage for
each function call and also the data elements required
by a function.
• A gap is shown between heap and stack to indicate that
many OS leave some room between these 2 portions, so
that both can grow dynamically.
• The kernel context of a process is maintained and
accessible only to the kernel. This area contains info
that the kernel needs to keep track of the process and to
stop and restart the process while other processes are
allowed to execute.
Daemon Process
Introduction
• A daemon process usually starts during system startup.
• Daemons frequently spawn other processes to handle service
requests.
– Mostly started by the initialization script /etc/rc
• A daemon waits for an event to occur, or
• performs some specified task on a periodic basis (cron
job), or
• performs the requested service and waits again.
– Example: print server
Characteristics
• Executes as a background process
• Orphan process
• No controlling terminal
• Usually runs with superuser privileges
• Process group leader
• Session leader
How to daemonize
1. Call umask to set the file mode creation mask to a known value, usually 0.
2. Call fork and have the parent exit. Child inherits the process group ID of the parent
but gets a new process ID, so we’re guaranteed that the child is not a process group
leader. This is a prerequisite for the call to setsid that is done next.
3. Call setsid to create a new session. The three steps listed in Section 9.5 occur.
The process (a) becomes the leader of a new session, (b) becomes the leader of a
new process group, and (c) is disassociated from its controlling terminal.
4. Change the current working directory to the root directory. The current
working directory inherited from the parent could be on a mounted file system.
5. Unneeded file descriptors should be closed. This prevents the daemon from
holding open any descriptors that it may have inherited from its parent (which
could be a shell or some other process).
6. Some daemons open file descriptors 0, 1, and 2 to /dev/null so that any
library routines that try to read from standard input or write to standard output
or standard error will have no effect.
$ ps -axj #to get all daemon process, does not have terminal
#include "apue.h"
#include <syslog.h>
#include <fcntl.h>
#include <sys/resource.h>
void
daemonize(const char *cmd)
{
int i, fd0, fd1, fd2;
pid_t pid;
struct rlimit rl;
struct sigaction sa;
/*
* Clear file creation mask.
*/
umask(0);
/*
* Get maximum number of file descriptors.
*/
if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
err_quit("%s: can’t get file limit", cmd);
/*
* Become a session leader to lose controlling TTY.
*/
if ((pid = fork()) < 0)
err_quit("%s: can’t fork", cmd);
else if (pid != 0) /* parent */
exit(0);
setsid();
/*
* Ensure future opens won’t allocate controlling TTYs.
*/
sa.sa_handler = SIG_IGN;
sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;
if (sigaction(SIGHUP, &sa, NULL) < 0)
err_quit("%s: can’t ignore SIGHUP", cmd);
if ((pid = fork()) < 0)
err_quit("%s: can’t fork", cmd);
else if (pid != 0) /* parent */
exit(0);
/*
* Change the current working directory to the root so
* we won’t prevent file systems from being unmounted.
*/
if (chdir("/") < 0)
err_quit("%s: can’t change directory to /", cmd);
/*
* Close all open file descriptors.
*/
if (rl.rlim_max == RLIM_INFINITY)
rl.rlim_max = 1024;
for (i = 0; i < rl.rlim_max; i++)
close(i);
/*
* Attach file descriptors 0, 1, and 2 to /dev/null.
*/
fd0 = open("/dev/null", O_RDWR);
fd1 = dup(0);
fd2 = dup(0);
/*
* Initialize the log file.
*/
openlog(cmd, LOG_CONS, LOG_DAEMON);
if (fd0 != 0 || fd1 != 1 || fd2 != 2) {
syslog(LOG_ERR, "unexpected file descriptors %d %d %d",
fd0, fd1, fd2);
exit(1);
}
}
7. Process Management
System calls related to process management
1. wait, waitpid
#include <sys/types.h>
#include <sys/wait.h>
pid_t wait(int *status);
pid_t waitpid(pid_t pid, int *status, int options);
int waitid(idtype_t idtype, id_t id, siginfo_t *infop, int options);
wait, waitpid - wait for process to change state
A state change is considered to be: the child terminated; the child was stopped
by a signal; or the child was resumed by a signal. In the case of a
terminated child, performing a wait allows the system to release the resources
associated with the child; if a wait is not performed, then the terminated child
remains in a "zombie" state.
If a child has already changed state, then these calls return immediately.
Otherwise they block until either a child changes state or a signal handler
interrupts the call (assuming that system calls are not automatically restarted using
the SA_RESTART flag of sigaction(2)).
waitpid(-1, &status, 0);
The value of pid can be:
< -1 meaning wait for any child process whose process group ID is equal to the
absolute value of pid.
-1 meaning wait for any child process.
0 meaning wait for any child process whose process group ID is equal to
that of the calling process.
> 0 meaning wait for the child whose process ID is equal to the value of pid.
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
int main(void) {
    int i = 0;
    pid_t pid;
    printf("Ready to fork\n");
    pid = fork();
    if (pid == 0)
    {
        printf("Child starts\n");
        for (i = 0; i < 1000; i++) printf("%d\t", i);
        printf("Child ends\n");
        /* sleep(30); */ /* uncomment, and remove the parent's wait, to get an orphaned child */
    } else {
        wait(0); /* replace with sleep(30) to leave the child as a zombie */
        printf("Parent process\n");
    }
    return 0;
}
2. exec
#include <unistd.h>
extern char **environ;
int execl(const char *path, const char *arg, ...);
int execlp(const char *file, const char *arg, ...);
int execle(const char *path, const char *arg,
..., char * const envp[]);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
execl, execlp, execle, execv, execvp - execute a file
execl
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
int main(void)
{
    pid_t pid;
    pid = fork();
    if (pid == 0)
    {
        printf("Exec starts\n");
        execl("/bin/ls", "ls", "-l", (char *)0);
        /* reached only if execl fails */
        printf("Execl did not work\n");
    }
    else
    {
        wait(0);
        printf("Parent: ls completed in child\n");
    }
    return 0;
}
execv
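#include <stdio.h>
#include <unistd.h>
int main(void)
{
    char *temp[4];
    temp[0] = "ls";
    temp[1] = "-l";
    temp[2] = (char *)0; /* argv must end with a null pointer */
    execv("/bin/ls", temp);
    /* reached only if execv fails */
    printf("This will not print\n");
    return 1;
}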
Day 3 - Morning
8. Memory Management
Defining and Creating secondary memory areas
Factors to be considered while designing secondary memory
Latency, Throughput and Bandwidth
Latency: Amount of time for a single operation to execute.
Throughput: Rate at which operations get executed.
Normally expressed as operations/second.
In sequential processing, throughput = 1 / latency.
Bandwidth: Total rate at which data moves between processor and memory.
It is the product of throughput and data width.
Pipelining, Parallelism and Pre-charging
Memory systems can be pipelined in much the same way that processors are pipelined, allowing
operations to overlap execution to improve throughput.
Many memory technologies require a certain delay (idle time ) between operations to
pre-charge circuitry for the next access.
Attaching multiple memories to the processor’s memory bus allows parallelism. This
increases the rate at which memory is accessed without increasing the pin count of the
processor.
Two kinds of systems that support parallelism - Replicated & Banked.
Replicated provides multiple copies of entire memory. Store needs to write into all
copies ( more expensive than loads ).
Banked memory - Data is divided or interleaved across memories.
Example:
What is the bandwidth of a memory system with a latency of 40 ns that transfers 1
byte per operation and is pipelined to allow 4 operations to overlap execution (
assume no pipelining overhead ) ?
Dividing latency 40 ns by number of overlapped operations ( 4 ) gives a rate of 1
operation per 10 ns as the throughput of the memory system. At 1 byte of data per
operation, this gives a bandwidth of
100 Mbyte/sec.
[Figure: the memory hierarchy. Labels: Process A, Process B, primary cache, MMU, TLB, page tables, main memory (physical memory, secondary memory), disk, virtual memory, buffer cache, page cache, swap cache.]
Levels in the Memory Hierarchy
Cache :
1. Generally implemented using SRAM.
2. Use hardware to keep track of addresses stored in them.
3. Tend to be small ( capacity ).
4. Small Block Sizes ( 32 to 128 bytes ).
Main Memory:
1. Generally implemented using DRAM.
2. Use software to keep track of addresses.
3. Larger capacity ( Few MB to several Gigabytes ).
4. Larger Block Sizes ( several kilobytes ).
Virtual Memory:
1. Implemented using disks.
2. Contains all of the data in the memory system.
Some terminology...
Hit : When an address is found at a given hierarchy.
Miss: When an address is NOT found at a given hierarchy.
Hit Rate: % of references that reach a given level & result in hits.
Miss Rate: % of references that reach a given level & result in misses.
Note: Hit Rate + Miss Rate = 100% ALWAYS.
When a miss occurs, a BLOCK of data is brought in from a lower level into the
current level of the hierarchy. As time progresses, the current level may fill up, and run
out of free space. A block must be removed to accommodate the new block. This is
called eviction or replacement. The method to decide on what block to remove is
called replacement policy.
To simplify evicting data blocks, many memory systems maintain a property called
inclusion. The presence of an address at a given level of a memory hierarchy
GUARANTEES that the address is present in ALL LOWER LEVELS of the memory
system.
Computing average access times in a memory hierarchy...
If we know the hit-rate and access-time ( time to complete a request that hits ) for
each level in the hierarchy, we can compute average access time of the memory
hierarchy. For each level in the hierarchy, the average access time is
( T_hit x P_hit ) + ( T_miss x P_miss )
Where T_hit = Time to resolve requests that hit in the level.
P_hit = Hit-rate of the level, expressed as a probability.
T_miss = Average access time of the level below this one.
P_miss = Miss-rate of the level.
Note that Hit-rate of the lowest level is 100%, we start at the bottom and compute the
average access time of each level upwards in the hierarchy.
Example:
A memory system contains a cache, a DRAM and a Virtual Store. The access time of
the cache is 5 ns with a hit-rate of 80%, whereas the access time of the DRAM is 100
ns with a 99.5 % hit-rate. The access time of the virtual store is 10 ms. What is the
average access time of the hierarchy ?
We start at the bottom and work upwards:
The hit-rate of Virtual store is always 100%.
Average access time for requests that reach DRAM
= ( 100 ns x 0.995 ) + ( 10 ms x 0.005 ) = 50,099.5 ns
The average access time for requests that reach the cache
( which is ALL REQUESTS !!)
= ( 5 ns x 0.80 ) + ( 50,099.5 ns x 0.20 ) = 10,023.9 ns, or about 10,024 ns
SRAM and DRAM Chips
These have the same basic structure ( shown in next slide )
Data is stored in rectangular array of bit cells, each holding 1 bit. To read data from the
array, half of the address to be read ( generally high order bits) is fed into a decoder.
The decoder asserts (drives high) the word line corresponding to the value of its input
bits, which causes all of the bit cells in the corresponding row to drive their values
onto bit lines that they are connected to.
The other half of the address is then used as an input to a multiplexer that selects the
appropriate bit line and drives its output onto the output pins of the chip.
To store data on the chip, the same process is used, except the value to be written is
driven on appropriate bit line and written into the selected bit cell.
Scheduling (Kernel perspective)
• The kernel keeps track of a process's creation time as well as
the CPU time that it consumes during its lifetime.
8. Memory Management
Memory allocation & deallocation system calls malloc, calloc, alloca, free
#include <stdlib.h>
void *malloc(size_t size);
void free(void *ptr);
void *calloc(size_t nmemb, size_t size);
void *realloc(void *ptr, size_t size);
The malloc() function allocates size bytes and returns a pointer to the allocated
memory. The memory is not initialized. If size is 0, then malloc() returns either
NULL, or a unique pointer value that can later be successfully passed to free().
The free() function frees the memory space pointed to by ptr, which must have
been returned by a previous call to malloc(), calloc(), or realloc().
The calloc() function allocates memory for an array of nmemb elements of size bytes
each and returns a pointer to the allocated memory. The memory is set to zero. If
nmemb or size is 0, then calloc() returns either NULL, or a unique pointer value that
can later be successfully passed to free().
The realloc() function changes the size of the memory block pointed to by ptr to size
bytes. The contents will be unchanged in the range from the start of the region up to
the minimum of the old and new sizes.
#include <alloca.h>
void *alloca(size_t size);
DESCRIPTION
The alloca() function allocates size bytes of space in the stack frame of the caller. This
temporary space is automatically freed when the function that called alloca() returns
to its caller.
8. Memory Management
Virtual Memory Management
Virtual Memory
Each program has a virtual address space which is the set of addresses that programs
use for load and store operations.
The physical address space is the set of addresses used to reference locations in main
memory.
The virtual address space is divided into pages some of which reside inside a page
frame ( slots in main memory ) while others reside on the disk. Pages are always
aligned on a multiple of the page size so that the addresses never overlap.
The terms virtual page and physical page are used to describe a page of data in the
virtual and physical address spaces respectively.
Pages that have been loaded into main memory are said to have been mapped.
Virtual memory allows a computer to act as if its main memory were much larger
than it actually is.
When a program references a virtual address, it cannot tell, except by timing the
latency of the operation, whether the virtual address was resident in the main memory
or whether it had to be fetched from disk.
This makes it possible for the computer to shuffle pages in and out of the main
memory exactly like data is brought in and out of the cache.
8. Memory Management
Address Translation and page fault handling
Address Translation
Programs running on systems with Virtual Memory use Virtual Addresses as the
arguments to load and store instructions.
The main memory uses Physical Addresses to record locations where data is actually
stored.
Whenever a program uses a Virtual Address, this must be converted into a Physical
Address and this process is known as Address Translation.
When a program accesses a memory location, the O.S accesses a Page Table, which is a
data structure that contains the mapping of the virtual address to the physical address.
If the virtual page is mapped ( present in memory ) then the physical address is
retrieved and the operation proceeds.
If the virtual page is NOT mapped, then a page fault occurs and the O.S fetches the
page from the hard disk, loading it into a page frame, and updating the page table with
the new translation. Once the page has been read into memory from disk, and the
page table updated, the physical address of the page can be determined and the
memory reference completed.
If all the page frames already contain data, one of them must be evicted to the disk to
make room for the incoming data. The replacement policies used to select the page
that is evicted are similar to the ones for set-associative caches.
Because both virtual and physical pages are always aligned on a multiple of their size,
the page table does not need to keep track of the full virtual or physical address of a
page that is mapped. Instead virtual addresses are divided into a Virtual Page Number
or VPN and a set of bits that describe an offset from the start of the virtual page to the
virtual address. Similarly, the physical pages are divided into Physical Page Numbers or
PPN and an offset from the start of the physical page to the physical address.
The virtual and physical pages in a given system are generally the same size, so the
number of bits (log2 of the page size) for the offset of the virtual and physical addresses is the same.
The VPN and PPN may be of different lengths. For example, on 64-bit systems, the
virtual addresses are generally much longer than physical addresses.
The page table is accessed using the virtual page frame number as an offset.
Virtual page frame 5 would be the 6th element of the table (0 is the first element).
To translate a virtual address into a physical one, the processor must first work out
the virtual address's page frame number and the offset within that virtual page. By
making the page size a power of 2 this can be easily done by masking and shifting.
Assuming a page size of 0x2000 bytes (which is decimal 8192) and an address of
0x2194 in process Y’s virtual address space then the processor would translate that
address into offset 0x194 into virtual page frame number 1.
Alpha AXP Page Table Entry
V Valid, if set this PTE is valid,
FOE “Fault on Execute”, Whenever an attempt to execute instructions in this page
occurs, the processor reports a page fault and passes control to the operating system,
FOW “Fault on Write”, as above but page fault on an attempt to write to this page,
FOR “Fault on Read”, as above but page fault on an attempt to read from this page,
ASM Address Space Match. This is used when the operating system wishes to clear
only some of the entries from the Translation Buffer,
KRE Code running in kernel mode can read this page,
URE Code running in user mode can read this page,
GH Granularity hint used when mapping an entire block with a single Translation
Buffer entry rather than many,
KWE Code running in kernel mode can write to this page,
UWE Code running in user mode can write to this page,
page frame number For PTEs with the V bit set, this field contains the physical
Page Frame Number (page frame number) for this PTE. For invalid PTEs, if
this field is not zero, it contains information about where the page is in the
swap file.
The following two bits are defined and used by Linux:
PAGE DIRTY if set, the page needs to be written out to the swap file,
PAGE ACCESSED Used by Linux to mark a page as having been accessed.
TLB, Translation Lookaside Buffers
A major disadvantage of using page tables is that a page table must be accessed for
every memory reference. On a system with a single-level page table, this doubles the
number of memory accesses, since each load or store operation requires one memory
reference to access the appropriate page table and one to perform the actual
load/store. This greatly increases the latency of a memory reference.
The problem is even greater on multi-level page tables, because multiple references are
required to traverse the page table. To reduce this penalty, CPUs that incorporate virtual
memory use Translation Lookaside Buffers (TLBs) that act as caches for the page table.
Whenever a program performs a memory reference, the virtual address is sent
to the TLB to determine if it contains a translation for that address. If so, the TLB
returns the physical address and the memory reference continues.
If not, a TLB miss occurs and the system searches the page table for a translation. Some
systems provide hardware support for a TLB miss while others require the OS to access
the page table through software.
TLB misses versus Page Faults
In a system that supports TLBs, 3 possible cases exist:
1. Hit in the TLB : The TLB contains the physical address and it is returned
immediately.
2. TLB miss, but page mapped: In this case the system accesses the page table in
memory to find the translation for the virtual address, copies that translation into the TLB,
and completes the memory reference.
3. TLB miss and page not mapped: The system accesses the page table and finds that the page
is not mapped. This results in a page fault. The O.S loads the page’s data from disk in
the same manner as a virtual memory system that does not contain a TLB.
TLB misses and page faults are handled very differently by the O.S because of the
difference in the amount of time it takes to resolve each event.
A TLB miss on a mapped page generally takes a short time to resolve, normally
a few hundred cycles, so the user program can simply wait for its completion.
A TLB miss that results in a page fault can take a few milliseconds, which is comparable to the
time slice generally given to a process. Therefore, a page fault can trigger a context
switch by invoking the scheduler while the page fault is being resolved.
TLB Entry
TLBs are organized similarly to caches, with an associativity and a number of sets. While
cache sizes are typically described in bytes, TLB sizes are given as the number of entries or
translations contained in them, since the amount of space taken up by each entry is
mostly irrelevant to the performance of the system.
Thus a 128-entry, 4-way set-associative TLB would have 32 sets, each containing 4
entries.
The TLB entry contains the VPN of the page that it is a translation for, which is
compared to the VPN of the address of a memory reference to determine if a hit has
occurred.
Like a cache’s tag array entry, bits of the VPN used to select an entry in the TLB are
omitted to save space. All the bits of the PPN are stored however, since they may differ
from the corresponding bits in the VPN.
8. Memory Management
Demand Paging defined
Demand Paging
As there is much less physical memory than virtual memory the operating system
must be careful that it does not use the physical memory inefficiently. One way to
save physical memory is to only load virtual pages that are currently being used by
the executing program.
This technique of only loading virtual pages into memory as they are accessed is
known as demand paging.
When a process attempts to access a virtual address that is not currently in memory
the processor cannot find a page table entry for the virtual page referenced. For
example, in the previous figure there is no entry in process X’s page table for virtual page
frame number 2 and so if process X attempts to read from an address within virtual
page frame number 2 the processor cannot translate the address into a physical
one. At this point the processor notifies the operating system that a page fault has
occurred.
If the faulting virtual address is invalid this means that the process has attempted
to access a virtual address that it should not have. Maybe the application has gone
wrong in some way, for example writing to random addresses in memory. In this case
the operating system will terminate it, protecting the other processes in the system
from this rogue process.
If the faulting virtual address was valid but the page that it refers to is not currently
in memory, the operating system must bring the appropriate page into memory from
the image on disk.
The fetched page is written into a free physical page frame and an entry for the
virtual page frame number is added to the process's page table. The process is then
restarted at the machine instruction where the memory fault occurred. This time
the virtual memory access is made, the processor can make the virtual to physical
address translation and so the process continues to run.
Linux uses demand paging to load executable images into a process's virtual memory.
Whenever a command is executed, the file containing it is opened and its contents
are mapped into the process's virtual memory. This is done by modifying the data
structures describing this process's memory map and is known as memory mapping.
However, only the first part of the image is actually brought into physical memory.
The rest of the image is left on disk. As the image executes, it generates page faults
and Linux uses the process's memory map in order to determine which parts of the
image to bring into memory for execution.
8. Memory Management
Process Organization in Memory
Process Address Space
Day 4 - Morning
9. Multi Thread Programming
●Creating multiple threads
●Parent synchronization with other Thread
Introduction
• Thread is a sequential flow of control through a program.
• If a process is defined as a program in execution then a thread is defined as a
function in execution.
• If a thread is created, it will execute a specified function.
• Two types of threading:
– Single Threading
– Multi threading
POSIX Thread
The created threads within a process share:
instructions of the process
process address space and data
open file descriptors
signal handlers
current working directory, uid and gid
Each created thread maintains its own:
thread identification number (tid)
pc, sp, set of registers
stack
priority of the thread
scheduling policy
Advantages of Threads:
Takes less time for:
• Creation of a new thread
• Termination of a thread
• Communication between threads, which is also easier.
There are two broad categories of thread
implementation:
1. User level Threads (ULT)
2. Kernel level threads (or kernel-supported
threads or Light weight processes)
Thread management (User Level Threads)
• Thread management is done by the application and the kernel is not aware of the
existence of threads.
• The thread library contains code for creating and destroying threads, passing messages
and data between threads, scheduling thread execution, and saving and restoring
thread contexts.
• The application and its threads are allocated to a single process managed by the kernel.
• All the activity takes place in user space and within a single process. The kernel
continues to schedule the process as a unit and assigns a single execution state to that
process.
ULT
Advantages:
• Thread switching does not require kernel mode.
• Scheduling can be application specific.
• Can run on any OS.
Disadvantages:
• When a thread executes a blocking system call, not only is that thread
blocked, but all the threads within the process are blocked.
KLT
Kernel Level Threads:
• Thread management is done by the kernel
– Advantage: If one thread in a process is blocked, kernel can
schedule another thread of the same process.
– Disadvantage: Transfer of control from one thread to
another within the same process requires a mode switch to
the kernel
Advantages of Multi Threading
Improve application responsiveness
Use multiprocessors more efficiently
Improve program structure
use fewer system resources
Specific applications in uniprocessor machines
Applications
A file server on a LAN
Graphical User Interfaces (GUIs)
web applications
9. Multi Thread Programming
Parent synchronization with other Thread
The parent waits in pthread_join() for its child threads to finish.
Hello Thread Example
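#include <stdio.h>
#include <pthread.h>
void *thread_function(void *arg) {
    printf("Hello POSIX Thread\n");
    /* pthread_t is an integer type on Linux; the cast is not portable */
    printf("Thread id: %lu\n", (unsigned long)pthread_self());
    return NULL;
}
int main(void) {
    pthread_t mythread;
    pthread_create(&mythread, NULL, thread_function, NULL);
    pthread_join(mythread, NULL);
    return 0;
}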
$cc thread.c -lpthread
9. Multi Thread Programming
System calls
1. pthread_create
#include <pthread.h>
int pthread_create(pthread_t *restrict thread, const pthread_attr_t *restrict
attr, void *(*start_routine)(void*), void *restrict arg);
The pthread_create() function shall create a new thread, with attributes specified
by attr, within a process. If attr is NULL, the default attributes shall be used. If the
attributes specified by attr are modified later, the thread’s attributes shall not be
affected. Upon successful completion, pthread_create() shall store the ID of the
created thread in the location referenced by thread.
#include<stdio.h>
#include<unistd.h>
#include<stdlib.h>
#include<pthread.h>
#include<string.h>
void *thread_fun(void *arg);
char message[]="hello world";
int main()
{
int res;
pthread_t a_thread;
void *thread_result;
res=pthread_create(&a_thread,NULL,thread_fun,(void *)message);
if(res !=0){
perror("unable to create thread\n");
exit(1);
}
printf("waiting for thread to finish\n");
//Thread joining, catch exit value from the thread
res=pthread_join(a_thread,&thread_result);
if(res !=0){
perror("unable to join thread\n");
exit(1);
}
printf("thread joined , it returned %s\n",(char *)thread_result);
printf("Message is now %s\n",message);
exit(0);
}
void *thread_fun(void *arg)
{
printf("thread fun ,arg is %s\n",(char *)arg);
sleep(3);
strcpy(message,"bye");
//exit with return value
pthread_exit("thank you");
}
2. pthread_key_create
#include <pthread.h>
int pthread_key_create(pthread_key_t *key, void (*destructor)(void*));
pthread_key_create - thread-specific data key creation
The pthread_key_create() function shall create a thread-specific data
key visible to all threads in the process. Key values provided by
pthread_key_create() are opaque objects used to locate thread-specific data.
Although the same key value may be used by different threads, the values bound
to the key by pthread_setspecific() are maintained on a per-thread basis and
persist for the life of the calling thread.
Upon key creation, the value NULL shall be associated with the new key in
all active threads. Upon thread creation, the value NULL shall be associated with
all defined keys in the new thread.
#include <malloc.h>
#include <pthread.h>
#include <stdio.h>
#include<stdlib.h>
#include<unistd.h>
/* The key used to associate a log file pointer with each thread. */
static pthread_key_t thread_log_key;
/* Write MESSAGE to the log file for the current thread. */
void write_to_thread_log (const char* message)
{
FILE* thread_log = (FILE*) pthread_getspecific (thread_log_key);
fprintf (thread_log, "%s\n", message);
}
/* Close the log file pointer THREAD_LOG. */
void close_thread_log (void* thread_log)
{
fclose ((FILE*) thread_log);
}
void* thread_function (void* args)
{
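char thread_log_filename[32];
FILE* thread_log;
/* Generate the filename for this thread's log file. The cast assumes
pthread_t is an integer type, as on Linux; snprintf keeps the name
inside the buffer. */
snprintf (thread_log_filename, sizeof thread_log_filename, "thread%lu.log", (unsigned long) pthread_self ());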
/* Open the log file. */
thread_log = fopen (thread_log_filename, "w");
/* Store the file pointer in thread-specific data under thread_log_key. */
pthread_setspecific (thread_log_key, thread_log);
write_to_thread_log ("Thread starting.");
/* Do work here... */
return NULL;
}
int main (void)
{
int i;
pthread_t threads[5];
/* Create a key to associate thread log file pointers in
thread-specific data. Use close_thread_log to clean up the file
pointers. */
pthread_key_create (&thread_log_key, close_thread_log);
/* Create threads to do the work. */
for (i = 0; i < 5; ++i)
pthread_create (&(threads[i]), NULL, thread_function, NULL);
/* Wait for all threads to finish. */
for (i = 0; i < 5; ++i)
pthread_join (threads[i], NULL);
return 0;
}
3. pthread_mutex_init
#include <pthread.h>
int pthread_mutex_destroy(pthread_mutex_t *mutex);
int pthread_mutex_init(pthread_mutex_t *restrict mutex,
const pthread_mutexattr_t *restrict attr);
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
The pthread_mutex_destroy() function shall destroy the mutex object referenced
by mutex; the mutex object becomes, in effect, uninitialized. An
implementation may cause pthread_mutex_destroy() to set the object
referenced by mutex to an invalid value. A destroyed mutex object can be
reinitialized using pthread_mutex_init(); the results of otherwise referencing the
object after it has been destroyed are undefined.
It shall be safe to destroy an initialized mutex that is unlocked. Attempting to
destroy a locked mutex results in undefined behavior.
The pthread_mutex_init() function shall initialize the mutex referenced by
mutex with attributes specified by attr. If attr is NULL, the default mutex attributes
are used; the effect shall be the same as passing the address of a default mutex
attributes object. Upon successful initialization, the state of the mutex becomes
initialized and unlocked.
#include<stdio.h>
#include<unistd.h>
#include<stdlib.h>
#include<string.h>
#include<pthread.h>
#include<semaphore.h>
void *thread_fun(void *arg);
pthread_mutex_t work_mutex;
char work_area[1024];
int time_to_exit=0;
int main()
{
int res;
pthread_t a_thread;
void *thread_result;
res=pthread_mutex_init(&work_mutex,NULL);//initialize mutex default attr
res=pthread_create(&a_thread,NULL,thread_fun,NULL);
pthread_mutex_lock(&work_mutex); //put a lock to the main thread, then
printf("input some text enter end to finish\n");
while(!time_to_exit)
{
fgets(work_area,1024,stdin);
//unlock the main thread,your subordinate is waiting
pthread_mutex_unlock(&work_mutex);
while(1){
pthread_mutex_lock(&work_mutex);//lock it is your turn
if(work_area[0] != '\0') {
pthread_mutex_unlock(&work_mutex);
sleep(1);
}
else
break;
}
}
pthread_mutex_unlock(&work_mutex);
printf("waiting for thread to finish\n");
res=pthread_join(a_thread,&thread_result);
printf("thread joined , it returned %s\n",(char *)thread_result);
pthread_mutex_destroy(&work_mutex);
exit(0);
}
void *thread_fun(void *arg)
{
sleep(1);//Sleep well Let main thread send some data
pthread_mutex_lock(&work_mutex);//lock the curr thread
while(strncmp("end",work_area,3) !=0)
{
printf("you entered %zu characters \n",strlen(work_area) -1);
work_area[0]='\0';
pthread_mutex_unlock(&work_mutex);//unlock the current thread
sleep(1);//Sleep well , Let main thread do it's job
pthread_mutex_lock(&work_mutex);
while(work_area[0] == '\0')
{
pthread_mutex_unlock(&work_mutex);
sleep(1);
pthread_mutex_lock(&work_mutex);
}
}
time_to_exit=1;
work_area[0]='\0';
pthread_mutex_unlock(&work_mutex);
pthread_exit("thank you");
}//End of the function
Day 4 - Morning
10. Inter process communication
●Pipes
●Fifo's
●Signals
●System-V IPC's
- Message queues
- Shared memory
- Semaphores
●Persistence of various ipcs
Unnamed Pipe or Pipe
• On the command line a pipe is represented as "|".
• It can be used in the shell to link two or more commands
– For example ls -Rl | wc
• The two ends of a pipe are represented as a set of two descriptors.
• A pipe is used to communicate between related processes (common ancestor).
Normally, a pipe is created by a process, that process calls fork, and the pipe is
used between the parent and the child.
• Half duplex
• Data is passed in order.
• A pipe uses a fixed-size circular buffer in the kernel (16 pages, 64 KiB with
4 KiB pages, by default on current Linux); data is not kept once it has been read.
• The read and write system calls block by default: read waits while the pipe is
empty and write waits while it is full.
#include <unistd.h>
int fd[2];
int pipe(int fd[2]);
One Way Communication between parent and child
• Create a pipe.
• Call fork.
• Parent can send data and child can read the data or vice versa.
• Unused ends (descriptors) should be closed.
• For a pipe from parent to child, the parent closes the read end of the pipe (fd[0]), and the child closes the write end (fd[1]).
Figure 15.4 shows the resulting arrangement of descriptors.
Two way Communication
• Create two pipes say fd1, fd2.
• Four descriptors for each process (fd1[0], fd1[1], fd2[0], fd2[1]).
• Parent closes read end of fd1 and write end of fd2
– close(fd1[0]); close(fd2[1]);
• Child closes read end of fd2 and write end of fd1
– close(fd2[0]); close(fd1[1]);
Pipe : Advantages & Disadvantages
Advantages:
• Simplest form of IPC
• Persistence in process level
• Can be used in shell
Disadvantages:
• Cannot be used to communicate between unrelated processes
popen and pclose Functions
The function popen does a fork and exec to execute the cmdstring and returns a
standard I/O file pointer. If type is "r", the file pointer is connected to the standard
output of cmdstring.
If type is "w", the file pointer is connected to the standard input of cmdstring.
#include <stdio.h>
FILE *popen(const char *cmdstring, const char *type);
Returns: file pointer if OK, NULL on error
int pclose(FILE *fp);
Returns: termination status of cmdstring, or -1 on error
Result of fp = popen(cmdstring, "r")
Result of fp = popen(cmdstring, "w")
SIMPLEX PIPE
#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
int main(void)
{
    int pipefd[2], n;
    char buff[100];
    if(pipe(pipefd) < 0) { //create a pipe with two descriptors
        perror("failed in opening pipe");
        exit(1);
    }
    printf("read fd = %d, write fd = %d\n",pipefd[0],pipefd[1]);
    //write into the pipe's write descriptor (the 18 bytes include the trailing '\0')
    if(write(pipefd[1],"hello world.....!",18)!= 18)
        perror("failed in writing pipe");
    //read from the pipe's read descriptor
    if((n = read(pipefd[0],buff,sizeof(buff))) < 0)
        perror("failed in reading pipe");
    else
        write(1 , buff, n); //write to the stdout
    exit(0);
}
DUPLEX PIPE
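#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>    /* open, O_RDONLY */
#include <sys/wait.h> /* wait */
#define MAXBUF 1024
/* Client: sends a file name to the server and prints whatever comes back. */
void client(int readfd, int writefd) {
    char buff[MAXBUF];
    int n;
    puts("Enter file name");
    /* bounded read: at most MAXBUF-1 characters; on end of input close the
       write end so that the server sees end of file */
    if(scanf("%1023s",buff) != 1) { close(writefd); return; }
    n = strlen(buff);
    if(write(writefd,buff, n) !=n) perror("client: write error");
    while((n = read(readfd,buff,MAXBUF)) > 0){
        if(write(1,buff,n)!= n)
            perror("client: error");
    }
    if(n < 0)
        perror("client: read error");
}
/* Server: reads a file name and writes the contents of that file back. */
void server(int readfd,int writefd) {
    char buff[MAXBUF];
    int n, fd;
    /* read at most MAXBUF-1 bytes so there is room for the '\0' */
    if((n = read(readfd, buff, MAXBUF-1)) <= 0) { perror("server: read error"); return; }
    buff[n] = '\0';
    if((fd = open(buff,O_RDONLY)) < 0) { perror("server: open error"); return; }
    while((n = read(fd,buff,MAXBUF)) > 0) if(write(writefd,buff,n)!= n)
        perror("server: write error");
    if(n < 0) perror("server: read error");
    close(fd);
}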
int main(void) {
    int pipefd1[2], pipefd2[2], childfd;
    if(pipe(pipefd1) < 0 || pipe(pipefd2) < 0) {
        perror("failed in opening pipes");
        exit(1);
    }
if((childfd = fork()) < 0){
perror("can't fork");
close(pipefd1[0]);
close(pipefd1[1]);
close(pipefd2[0]);
close(pipefd2[1]);
}
else if(childfd > 0){
//Parent process
close(pipefd1[0]);
//read1
close(pipefd2[1]);
//write2
client(pipefd2[0],pipefd1[1]);
while(wait(( int *) 0)!= childfd);
close(pipefd1[1]);
close(pipefd2[0]);
} else {
// child process
close(pipefd1[1]);
// write1
close(pipefd2[0]);
// read2
server(pipefd1[0],pipefd2[1]);
close(pipefd1[0]);
close(pipefd2[1]);
}
exit(0);
}
FIFO: Introduction
• A FIFO (named pipe) works much like a pipe
– Half duplex, data passed in FIFO order through a kernel buffer;
nothing is stored in the file itself.
• A FIFO is created on a file system as a special file
• It can be used to communicate between unrelated
processes
• It can be reused.
• It persists until the file is deleted.
FIFO Creation
• FIFO can be created in a shell by using mknod or
mkfifo command.
– mknod myfifo p
– mkfifo -m a=rw myfifo
• In a C program mknod system call or mkfifo library
function can be used.
– int mkfifo (const char *file_name, mode_t mode);
– int mknod (const char *file_name, mode_t mode, dev_t dev);
• mknod("./MYFIFO", S_IFIFO|0666, 0);
Using FIFO
• Once a FIFO is created either from a shell or
through a program, the file-related system calls
(open, read, write, select, close, etc.) are used to
access the FIFO.
• For example: Process 1 may open a FIFO in write
only mode and write some data.
• Process 2 may open the FIFO in read only mode,
read the data and display on the monitor.
FIFO: Disadvantages
• Data cannot be broadcast to multiple receivers.
• If there are multiple receivers, there is no way to direct
to a specific reader or vice versa.
• Cannot be used across network
• Less secure than a pipe, since any process with valid
access permission can access data.
• Cannot store data
• No message boundaries. Data is treated as a stream of
Bytes.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>     /* open, O_RDONLY */
#include <sys/stat.h>  /* mkfifo */
#include <sys/wait.h>  /* waitpid */
//fifos can be created in users home directory also.
#define FIFO1 "/tmp/fifo1"
#define FIFO2 "/tmp/fifo2"
#define MAXBUF 1024
void client(int readfd, int writefd) {
    char buff[MAXBUF];
    int n;
    puts("Enter file name");
    //reading file name (at most MAXBUF-1 characters); on end of input close
    //the write end so that the server sees end of file
    if(scanf("%1023s",buff) != 1) { close(writefd); return; }
    n = strlen(buff);
    if(write(writefd,buff, n) !=n) //writing file name into fifo
        perror("client: write error");
    while((n = read(readfd,buff,MAXBUF)) > 0)
        if(write(1,buff,n)!= n) perror("client: error");
    if(n < 0) perror("client: read error");
}
void server(int readfd,int writefd) {
    char buff[MAXBUF];
    int n, fd;
    //read at most MAXBUF-1 bytes so there is room for the '\0'
    if((n = read(readfd, buff, MAXBUF-1)) <= 0) { perror("server: read error"); return; }
    buff[n] = '\0';
    if((fd = open(buff,O_RDONLY)) < 0) { perror("server: open error"); return; }
    while((n = read(fd,buff,MAXBUF)) > 0)
        if(write(writefd,buff,n)!= n) perror("server: write error");
    if(n < 0) perror("server: read error");
    close(fd);
}
int main(void) {
    int readfd, writefd ,pid;
    //fifo is created with read and write permission for everyone, as limited by the umask
    if((mkfifo(FIFO1, 0666)) < 0){
        perror("Fifo1 failed");
        exit(1);
    }
    if((mkfifo(FIFO2, 0666)) < 0){
        perror("Fifo2 failed");
        unlink(FIFO1);
        exit(2);
    }
    if((pid = fork()) == 0){
        readfd = open(FIFO1, 0, 0);//child opens fifo1 for read (0 is O_RDONLY)
        writefd = open(FIFO2, 1, 0);//child opens fifo2 for write (1 is O_WRONLY)
        //child process calls server function
        server(readfd, writefd);
        exit(3);
    }
    writefd = open(FIFO1, 1, 0);
    readfd = open(FIFO2, 0, 0);
    //Parent becomes client process
    client(readfd, writefd);
    //parent waits until the child (pid) has exited
    waitpid(pid,NULL,0);
    close(readfd);
    close(writefd);
    //removing fifos from /tmp
    unlink(FIFO1);
    unlink(FIFO2);
    exit(0);
}
The common communication channel between user space program and kernel is
given by the system calls.
But there is a different channel, that of the signals, used both between user processes
and from kernel to user process.
Sending Signals
A program can signal a different program using the kill() system call with prototype
int kill(pid_t pid, int sig);
This will send the signal with number sig to the process with process ID pid . Signal
numbers are small positive integers.
Receiving signals
typedef void (*sighandler_t)(int);
sighandler_t signal(int sig, sighandler_t handler);
Signal      Value      Action   Comment
──────────────────────────────────────────────────────────────────────
SIGHUP        1        Term     Hangup detected on controlling terminal
                                or death of controlling process
SIGINT        2        Term     Interrupt from keyboard
SIGQUIT       3        Core     Quit from keyboard
SIGILL        4        Core     Illegal Instruction
SIGABRT       6        Core     Abort signal from abort(3)
SIGFPE        8        Core     Floating point exception
SIGKILL       9        Term     Kill signal
SIGSEGV      11        Core     Invalid memory reference
SIGPIPE      13        Term     Broken pipe: write to pipe with no readers
SIGALRM      14        Term     Timer signal from alarm(2)
SIGTERM      15        Term     Termination signal
SIGUSR1   30,10,16     Term     User-defined signal 1
SIGUSR2   31,12,17     Term     User-defined signal 2
SIGCHLD   20,17,18     Ign      Child stopped or terminated
SIGCONT   19,18,25     Cont     Continue if stopped
SIGSTOP   17,19,23     Stop     Stop process
SIGTSTP   18,20,24     Stop     Stop typed at terminal
SIGTTIN   21,21,26     Stop     Terminal input for background process
SIGTTOU   22,22,27     Stop     Terminal output for background process
Where three values are given, the first one is usually valid for alpha and sparc,
the middle one for x86, arm and most other architectures, and the last one for mips.
The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.
Signals not in the POSIX.1-1990 standard but described in
SUSv2 and POSIX.1-2001:
Signal      Value      Action   Comment
────────────────────────────────────────────────────────────────────
SIGBUS    10,7,10      Core     Bus error (bad memory access)
SIGPOLL                Term     Pollable event (Sys V).
                                Synonym for SIGIO
SIGPROF   27,27,29     Term     Profiling timer expired
SIGSYS    12,31,12     Core     Bad argument to routine (SVr4)
SIGTRAP       5        Core     Trace/breakpoint trap
SIGURG    16,23,21     Ign      Urgent condition on socket (4.2BSD)
SIGVTALRM 26,26,28     Term     Virtual alarm clock (4.2BSD)
SIGXCPU   24,24,30     Core     CPU time limit exceeded (4.2BSD)
SIGXFSZ   25,25,31     Core     File size limit exceeded (4.2BSD)
Various other signals:
Signal      Value      Action   Comment
────────────────────────────────────────────────────────────────────
SIGIOT        6        Core     IOT trap. A synonym for SIGABRT
SIGEMT     7,-,7       Term
SIGSTKFLT  -,16,-      Term     Stack fault on coprocessor (unused)
SIGIO     23,29,22     Term     I/O now possible (4.2BSD)
SIGCLD     -,-,18      Ign      A synonym for SIGCHLD
SIGPWR    29,30,19     Term     Power failure (System V)
SIGINFO    29,-,-               A synonym for SIGPWR
SIGLOST    -,-,-       Term     File lock lost (unused)
SIGWINCH  28,28,20     Ign      Window resize signal (4.3BSD, Sun)
SIGUNUSED  -,31,-      Core     Synonymous with SIGSYS
Blocking signals
Each process has a list (bitmask) of currently blocked signals. When a signal is blocked, it
is not delivered (that is, no signal handling routine is called), but remains pending.
The sigprocmask() system call serves to change the list of blocked signals. See
sigprocmask(2).
The sigpending() system call reveals what signals are (blocked and) pending.
The sigsuspend() system call suspends the calling process until a specified signal is
received.
When a signal is blocked, it remains pending, even when otherwise the process would
ignore it.
wait and SIGCHLD
Whenever something happens to the child (it exits, crashes, traps, stops, continues),
and in particular when it dies, the parent is sent a SIGCHLD signal. Whether or not
it handles that signal, the parent can use the system call wait() or waitpid()
(there are a few variations) to learn about the status of its stopped or deceased
children. In the case of a deceased child, as soon as a status has been reported,
the zombie vanishes.
If the parent is not interested it can say so explicitly (before the fork) using
signal(SIGCHLD, SIG_IGN);
or
struct sigaction act;
act.sa_handler = something;
act.sa_flags = SA_NOCLDWAIT;
sigaction (SIGCHLD, &act, NULL);
and as a result it will not hear about deceased children, and children will not be
transformed into zombies. The default action for SIGCHLD is also to ignore the signal,
but with the default disposition a terminated child stays a zombie until the parent
waits for it.
Returning from a signal handler
When the program was interrupted by a signal, its status (including all integer and
floating point registers) was saved, to be restored just before execution continues at the
point of interruption.
This means that the return from the signal handler is more complicated than an
arbitrary procedure return – the saved state must be restored.
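To this end, the kernel arranges that the return from the signal handler causes a jump
to a short piece of code (the signal trampoline) that invokes the sigreturn() system
call, which restores the saved state.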
# include <stdio.h>
# include <signal.h>
# include <unistd.h>
void
sig_fun(int);
main() {
struct sigaction signalact;
signalact.sa_handler = sig_fun;
sigemptyset(&signalact.sa_mask);
signalact.sa_flags =0;
sigaction(SIGINT, &signalact, 0);
while(1){
printf("hello world\n");
sleep(1);
}
}
void
sig_fun(int signal) {
printf("Hi, I got signal: %d\n",signal);
}
SIGCHLD
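# include <stdio.h>
# include <signal.h>
# include <unistd.h>
void sig_init(int);
int main(void) {
    pid_t pid;
    unsigned int i;
    if((pid = fork()) == 0) sleep(1);
    else {
        signal(SIGCHLD,sig_init);
        /* busy loop to keep the parent alive; an optimizing compiler may remove it */
        for(i=0;i < 1000000000;i++) ;
        printf("parent exiting\n");
    }
    return 0;
}
/* A signal handler takes the signal number as its argument. */
void sig_init(int signo)
{
    printf("child terminated\n");
}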
SIGUSR1 and SIGUSR2
#include<stdio.h>
#include<signal.h>
#include<unistd.h>
#include<sys/wait.h>
static void sighandler(int);
int main(void) {
    int parentpid,childpid,status;
    sigset_t block,old;
    /*install the handler for SIGUSR1 and SIGUSR2 before fork, so that neither
      process can be sent a signal it has no handler for yet*/
    if(signal(SIGUSR1,sighandler)==SIG_ERR) printf("Unable to create handler for SIGUSR1\n");
    if(signal(SIGUSR2,sighandler)==SIG_ERR) printf("Unable to create handler for SIGUSR2\n");
    /*block SIGUSR2 so that it stays pending until the child waits for it*/
    sigemptyset(&block);
    sigaddset(&block,SIGUSR2);
    sigprocmask(SIG_BLOCK,&block,&old);
    parentpid=getpid();
    if((childpid=fork())==0)
    {
        kill(parentpid,SIGUSR1);/* raise the SIGUSR1 signal*/
        printf("\nHi,child, I am here .............!\n\n");
        /*Child sleeps (no busy-wait) until SIGUSR2 is delivered*/
        printf("child,waiting for signal\n");
        sigsuspend(&old);
        printf("child done %d\n",getpid());
    }
    else if(childpid > 0)
    {
        kill(childpid,SIGUSR2);/* raise the SIGUSR2 signal*/
        printf("Parent:waiting for child to terminate.....\n");
        wait(&status);/*Parent waiting for the child termination*/
        printf("parent done %d\n",getpid());
    }
    else perror("fork");
    return 0;
}
static void sighandler(int signo) {
    switch(signo)
    {
    case SIGUSR1:/* Incoming SIGUSR1 signal*/
        printf("Parent:Received SIGUSR1 \n");
        break;
    case SIGUSR2:/*Incoming SIGUSR2 signal*/
        printf("Received SIGUSR2\n");
        break;
    default:
        printf("This should not be printed\n");
    }
    return;
}
Introduction
• Sys V IPC is implemented as a single unit.
• System V IPC Provides three mechanisms namely:
– Message Queues
– Shared Memory
– Semaphores
• The objects persist until they are explicitly deleted or the system is rebooted.
Common Attributes
Each IPC object has the following attributes.
key
id
Owner
Permission
Size
- Message queue – used-bytes, number of messages
- Shared memory – size, number of attach, status
- Semaphore – number of semaphores in a set
- The ipc_perm structure holds the common attributes of the resources.
System Limitations
$ ipcs -l
------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 32768
max total shared memory (kbytes) = 8388608
min seg size (bytes) = 1
------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767
------ Messages: Limits --------
max queues system wide = 16
max size of message (bytes) = 8192
default max size of queue (bytes) = 16384
Get a Key
• If we wish to communicate between different processes using an IPC resource, the
first step is to create a shared unique identifier.
• The simplest form of the identifier is a number—the system generates this number
dynamically for a given mechanism by using the ftok library function.
• But apart from the creator, other processes that want to communicate with the
creator process should agree to the key value.
• Syntax: key_t ftok (const char *filename, int id);
Get an id
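The syntax for a get function is:
int xxxget (key_t key, int xxxflg);
(xxx may be msg or shm or sem; shmget additionally takes the segment size and
semget the number of semaphores, before the flag argument)
If successful, returns an identifier; otherwise -1 for error.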
The key can be generated in three different ways
- from the ftok library function
- by choosing some static positive integer value
- by using the IPC_PRIVATE macro
flags commonly used with this function are IPC_CREAT and IPC_EXCL.
Control an Object
The syntax for the control function is:
int xxxctl (int xxxid, int cmd, struct xxxid_ds *buffer);
(xxx may be msg or shm; semctl takes a semaphore number before cmd and an
optional union semun argument instead of the buffer)
If successful, the xxxctl function returns zero, otherwise it returns -1.
The command argument may be
IPC_STAT
IPC_SET
IPC_RMID
Message Queues
• Message queue overcomes FIFO limitation like
storing data and setting message boundaries.
• Create a message queue
• Send message (s) to the queue
• Any process who has permission to access the
queue can retrieve message (s).
• Remove the message queue.
Each queue has the following msqid_ds structure associated with it:
struct msqid_ds {
struct ipc_perm msg_perm;
msgqnum_t msg_qnum; /*# of messages on queue */
msglen_t msg_qbytes; /*max # of bytes on queue */
pid_t msg_lspid; /*pid of last msgsnd() */
pid_t msg_lrpid; /*pid of last msgrcv() */
time_t msg_stime; /*last-msgsnd() time */
time_t msg_rtime; /*last-msgrcv() time */
time_t msg_ctime; /*last-change time */
.
.
.
};
msgget
• int msgget (key_t key, int msgflg);
• The first argument key can be passed from the return
value of the ftok function or made IPC_PRIVATE.
• To create a message queue, IPC_CREAT ORed with
access permission is set for the msgflg argument.
• Ex: msgid = msgget (key, IPC_CREAT | 0744);
msgid = msgget (key, 0);
msgsnd
• The syntax of the function is:
• int msgsnd (int msqid, const void *msgp, size_t msgsz, int msgflg);
• Arguments:
– message queue ID
– address of the message structure
– size of the message text (not counting the mtype field)
– message flag
• 0 or IPC_NOWAIT
struct mymesg {
long mtype; /* positive message type */
char mtext[512]; /* message data, of length nbytes */
};
msgrcv
Syntax of the function is:
ssize_t msgrcv (int msqid, void *msgp, size_t msgsz, long msgtype, int msgflg);
The msgtype argument is used to retrieve a particular
message.
0 - retrieve the first message in the queue (FIFO order)
+ve - retrieve the first message whose type equals msgtype
-ve - retrieve the first message with the lowest type that is <= the absolute value of msgtype
On success, msgrcv returns the number of bytes actually copied into the message
text.
Destroying a Message Queue
There are many ways:
• From command line, using one of the ways
– $ ipcrm msg msqid (older syntax)
– $ ipcrm -q msqid
– $ ipcrm -Q msgkey
• Using system call
– msgctl (msgid, IPC_RMID, 0);
Message Queue: Pseudo Code
key = ftok (".", 'a');
msqid = msgget (key, IPC_CREAT|0666);
msgsnd (msqid, &struct, sizeof (struct), 0);
msgrcv (msqid, &struct, sizeof (struct), mtype, 0);
msgctl (msqid, IPC_RMID, NULL);
$ipcrm msg msqid
Limitations
• Message queues are effective if a small amount of data is transferred.
• Very expensive for large transfers.
• During message sending and receiving, the message is copied from user buffer into
kernel buffer and vice versa
• So each message transfer involves two data copy operations, which results in poor
performance of a system.
• A message in a queue can not be reused
Message send tests.c
#include<sys/ipc.h>
#include<sys/types.h>
#include<sys/msg.h>
#include<unistd.h>
#include<stdlib.h>
#include<stdio.h>
struct message
{
long mtype;
char mtext[50];
};
int main(void)
{
    struct message m1;
    int msgid;
    if((msgid=msgget(1,0666|IPC_CREAT))==-1) {
        perror("msgget");
        exit(1);
    }
    m1.mtype=getpid();
    printf("Process id of the current process is:%ld\n",(long)getpid());
    printf("Enter the message you want to send to the queue\n");
    if(fgets(m1.mtext,sizeof(m1.mtext),stdin)==NULL)
        m1.mtext[0]='\0';
    if(msgsnd(msgid,&m1,sizeof(m1.mtext),0)==-1) {
        perror("msgsnd");
        exit(1);
    }
    printf("Message successfully sent\n");
    return 0;
}
Message receive testr.c
#include<sys/ipc.h>
#include<sys/types.h>
#include<sys/msg.h>
#include<unistd.h>
#include<stdlib.h>
#include<stdio.h>
struct message
{
long mtype;
char mtext[50];
};
int main(void) {
    struct message m1;
    int msgid;
    if((msgid=msgget(1,0666|IPC_CREAT))==-1) {
        perror("msgget");
        exit(1);
    }
    //receive the first message of any type; the size covers the whole text buffer
    if(msgrcv(msgid,&m1,sizeof(m1.mtext),0,MSG_NOERROR)==-1) {
        perror("msgrcv");
        exit(1);
    }
    printf("Message received from the process whose pid is:%ld\n",m1.mtype);
    printf("And the message is:%s\n",m1.mtext);
    return 0;
}
Message control testc.c
#include<sys/ipc.h>
#include<sys/types.h>
#include<sys/msg.h>
#include<unistd.h>
#include<stdlib.h>
#include<stdio.h>
main(){
int msgid;
if((msgid=msgget(1,0))==-1) {
perror("msgget");
exit(1);
}
if(msgctl(msgid,IPC_RMID,0)==-1) {
perror("msgctl");
exit(1);
}
printf("Message queue successfully deleted\n");
}
Shared Memory
• Very flexible and easy to use.
• Fastest IPC mechanism
• shared memory is used to provide access to
Global variable
Shared libraries
Word processors
Multi-player gaming environment
Http daemons
Other programs written in languages like Perl, C etc.,
Shared Memory: Data Structures
The data structures used in shared memory are
• shmid_ds
• ipc_perm
• Shminfo
• shm_info
• shmid_kernel
ipc_perm Structure
struct ipc_perm
{
__key_t __key; /* Key */
__uid_t uid; /* Owner's user ID */
__gid_t gid; /* Owner's group ID */
__uid_t cuid; /* Creator's user ID */
__gid_t cgid; /* Creator's group ID */
unsigned short int mode; /* r/w permission */
unsigned short int __seq; /* Sequence number */
};
shmid_ds
struct shmid_ds
{
struct ipc_perm shm_perm;
size_t shm_segsz;
__time_t shm_atime;
__time_t shm_dtime;
__time_t shm_ctime;
__pid_t shm_cpid;
__pid_t shm_lpid;
shmatt_t shm_nattch;
};
Steps to Access Shared Memory
The steps involved are:
• Creating shared memory
• Connecting to the memory & obtaining a pointer
to the memory
• Reading/Writing & changing access mode to the
memory
• Detaching from memory
• Deleting the shared segment
shmat
• Used to attach the created shared memory segment
onto a process address space.
• void *shmat(int shmid,void *shmaddr,int shmflg)
• Example: data=shmat(shmid,(void *)0,0);
• A pointer is returned on the successful execution of
the system call and the process can read or write to
the segment using the pointer.
Reading / Writing to Shared Memory
• Reading or writing to a shared memory is the easiest
part.
• The data is written on to the shared memory as we do it
with normal memory using the pointers
• Eg. Read:
printf("SHM contents : %s \n", data);
• Eg. Write:
printf("Enter a String : ");
scanf(" %[^\n]", data);
shmdt and shmctl
• An attached shared memory segment is detached with shmdt, passing the
address returned by shmat as the argument.
• Syntax: int shmdt(const void *shmaddr);
• To remove shared memory call:
shmctl(shmid,IPC_RMID,NULL);
• These functions return -1 on error and 0 on successful execution.
Shared Memory: Pseudo Code
• shmid = shmget (key, 1024, IPC_CREAT|0744);
• void *shmat (int shmid, void *shmaddr, int shmflg);
if the shm is read only pass SHM_RDONLY else 0
• data = shmat (shmid, (void *)0, 0);
• int shmdt (void *shmaddr);
• int shmctl (shmid, IPC_RMID, NULL);
Limitations
• Data can either be read or written only. Append is not allowed.
• Race condition
– Since many processes can access the shared memory,
any modification done by one process in the address space is visible to all other
processes. Since the address space is a shared resource, the developer should implement
a proper locking mechanism to prevent the race condition in the shared memory.
Shared memory create
#include<sys/ipc.h>
#include<sys/shm.h>
#include<stdio.h>
#include <stdlib.h>
#include <string.h>
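int main(void)
{
    int shmid,pos;
    char *msg;
    if((shmid=shmget(110,1024,IPC_CREAT|0666))==-1) {
        perror("shmget");
        exit(1);
    }
    msg=shmat(shmid,0,0);
    if(msg==(char *)-1) {
        perror("shmat");
        exit(1);
    }
    printf("Enter the data you want to write into shared memory\n");
    if(fgets(msg,1024,stdin)!=NULL) {
        pos = strlen(msg);
        if(pos > 0 && pos+5 <= 1024)
            strcpy(msg+pos-1,"World"); //overwrites the last character (normally the newline)
        printf("Data successfully written\n");
    }
    shmdt(msg); //detach the segment from this process
    return 0;
}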
Shared memory read
#include<sys/ipc.h>
#include<sys/shm.h>
#include<stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    int shmid;
    char *msg;
    if((shmid=shmget(110,1024,0666|IPC_CREAT))==-1) { //get shared memory id
        perror("shmget");
        exit(1);
    }
    msg=shmat(shmid,0,0);
    if(msg==(char *)-1) {
        perror("shmat");
        exit(1);
    }
    printf("Data written in the shared memory is:%s\n",msg);
    shmdt(msg); //detach the segment from this process
    return 0;
}
Shared memory control
#include<sys/ipc.h>
#include<sys/shm.h>
#include<stdio.h>
#include <stdlib.h>
int main(void)
{
    int shmid;
    if((shmid=shmget(110,0,0))==-1) //110 is key
    {
        perror("shmget");
        exit(1);
    }
    if(shmctl(shmid,IPC_RMID,0)==-1) {
        perror("shmctl");
        exit(1);
    }
    printf("Shared memory successfully removed\n");
    return 0;
}
Semaphores
• If a process wants to use the shared object, it will “lock” it by asking the semaphore
to decrement the counter
• Depending upon the current value of the counter, the semaphore will either be able
to carry out this operation, or will have to wait until the operation becomes possible
• If the current value of the counter is >0, the decrement operation is possible.
Otherwise, the process has to wait.
System V IPC: Semaphores
• System V semaphore provides a semaphore set
- that can include a number of semaphores. It is up to user to decide the number of
semaphores in the set.
• Each semaphore in the set can be a binary or a counting semaphore. Each
semaphore can be used to control access to one resource – by changing the value of
semaphore count.
Semaphore: Initialization
union semun {
    int val;                    // value for SETVAL
    struct semid_ds *buf;       // buffer for IPC_STAT, IPC_SET
    unsigned short int *array;  // array for GETALL, SETALL
};
union semun arg;
semid = semget (key, 1, IPC_CREAT | 0644);
arg.val = 1; /* 1 for binary else > 1 for Counting Semaphore */
semctl (semid, 0, SETVAL, arg);
Semaphore: Implementation
struct sembuf {
short sem_num; /* semaphore number: 0 means first */
short sem_op; /* semaphore operation: lock or unlock */
short sem_flg; /* operation flags : 0, SEM_UNDO, IPC_NOWAIT */
};
struct sembuf buf = {0, -1, 0}; /* (-1 + previous value) */
semid = semget (key, 1, 0);
semop (semid, &buf, 1); /* locked */
/* ----- Critical section ----- */
buf.sem_op = 1;
semop (semid, &buf, 1); /* unlocked */
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
    struct seminfo *__buf;
};
void *th_fun(void *);
union semun u;
int sid;
int pid;
struct sembuf su, sl;
int main(void)
{
    pthread_t t1, t2, t3, t4;
    key_t key;
    key = ftok("semaphore.c", 100);     /* the file semaphore.c must exist */
    sid = semget(key, 1, IPC_CREAT | 0666);
    if (sid == -1) {
        perror("semget");
        exit(1);
    }
    printf("semaphore created by %d\n", getpid());
    u.val = 2;                          /* two threads may hold the resource at once */
    semctl(sid, 0, SETVAL, u);
    printf("Semaphore initialized to %d\n", u.val);
    pid = getpid();
    sl.sem_num = 0;
    sl.sem_op = -1;                     /* lock */
    sl.sem_flg = SEM_UNDO;
    su = sl;
    su.sem_op = 1;                      /* unlock */
    pthread_create(&t1, NULL, th_fun, "Thread One");
    pthread_create(&t2, NULL, th_fun, "Thread two");
    pthread_create(&t3, NULL, th_fun, "Thread three");
    pthread_create(&t4, NULL, th_fun, "Thread four");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_join(t3, NULL);
    pthread_join(t4, NULL);
    semctl(sid, 0, IPC_RMID);           /* remove the semaphore set */
    printf("Semaphore removed\n");
    return 0;
}
void *th_fun(void *p)
{
    char *str = (char *)p;
    int i = 0;
    printf("%s is Trying to lock semaphore %d\n\n", str, pid);
    if (semop(sid, &sl, 1) == 0)
        printf("%s Succeeded in Lock %d\n\n", str, pid);
    while (++i < 3) {
        printf("%s Resource use here %d\n\n", str, pid);
        sleep(6);
    }
    semop(sid, &su, 1);
    printf("%s Unlock and Bye %d\n\n", str, pid);
    return NULL;
}
Day 5 - Morning
11. Sockets
●An Overview
●System calls related to
- TCP
- UDP socket
A socket is an abstraction of a communication endpoint. Just as they would use file
descriptors to access files, applications use socket descriptors to access sockets. Socket
descriptors are implemented as file descriptors in the UNIX System. Indeed, many of
the functions that deal with file descriptors, such as read and write, will work with a
socket descriptor.
To create a socket, we call the socket function.
#include <sys/socket.h>
int socket(int domain, int type, int protocol);
Returns: file (socket) descriptor if OK, −1 on error
Domain
Type
Protocol
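The three arguments can be illustrated with a small helper that creates a socket and asks the kernel which type it recorded. This is a sketch, not part of the course material: socket_type_of is a hypothetical name. The domain selects the address family (for example AF_INET for IPv4 or AF_UNIX for local sockets), the type selects the communication style (SOCK_STREAM or SOCK_DGRAM), and a protocol of 0 selects the default protocol for that domain and type.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Create a socket with the given domain, type and protocol, then read back
   the socket type with getsockopt(SO_TYPE).
   Returns the type (e.g. SOCK_STREAM), or -1 if any call fails. */
int socket_type_of(int domain, int type, int protocol)
{
    int so_type = -1;
    socklen_t len = sizeof(so_type);
    int fd = socket(domain, type, protocol);

    if (fd == -1)
        return -1;
    assert(fd >= 0);                    /* a socket descriptor is a file descriptor */
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &so_type, &len) == -1)
        so_type = -1;
    close(fd);                          /* close() deallocates the socket */
    return so_type;
}
```

For example, socket_type_of(AF_INET, SOCK_STREAM, 0) and socket_type_of(AF_INET, SOCK_STREAM, IPPROTO_TCP) both describe a TCP socket, because TCP is the default stream protocol for AF_INET.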
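The socket() call is similar to the open() system call. The functions that operate on
file descriptors behave as follows when applied to a socket descriptor:
close - deallocates the socket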
dup, dup2 - duplicates the file descriptor as normal
fchdir - fails with errno set to ENOTDIR
fchmod - unspecified
fchown - implementation defined
fcntl -some commands supported, including F_DUPFD, F_DUPFD_CLOEXEC,
F_GETFD, F_GETFL, F_GETOWN, F_SETFD, F_SETFL, and F_SETOWN
fdatasync, fsync - implementation defined
fstat - some stat structure members supported, but how left up to the implementation
ftruncate - unspecified
ioctl - some commands work, depending on underlying device driver
lseek - implementation defined (usually fails with errno set to ESPIPE)
mmap - unspecified
poll - works as expected
pread and pwrite - fails with errno set to ESPIPE
read and readv - equivalent to recv without any flags
select - works as expected
write and writev - equivalent to send without any flags
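Two rows of the list above can be checked directly. The sketch below assumes Linux and uses a connected AF_UNIX socket pair so that it needs no network; the function names are illustrative only. It shows plain read and write working on a socket, and lseek failing with ESPIPE.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send msg through a connected socket pair using plain write() and read().
   Returns the number of bytes read into out, or -1 on error. */
ssize_t rw_over_socketpair(const char *msg, char *out, size_t outlen)
{
    int sv[2];
    ssize_t n = -1;

    assert(msg != NULL && out != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (write(sv[0], msg, strlen(msg)) != -1)   /* like send() without flags */
        n = read(sv[1], out, outlen);           /* like recv() without flags */
    close(sv[0]);
    close(sv[1]);
    return n;
}

/* Call lseek() on a socket descriptor.
   Returns the errno it failed with, or 0 if it unexpectedly succeeded. */
int lseek_errno_on_socket(void)
{
    int sv[2];
    int err = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return errno;
    if (lseek(sv[0], 0, SEEK_SET) == (off_t)-1)
        err = errno;                            /* a socket is not seekable */
    close(sv[0]);
    close(sv[1]);
    return err;
}
```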
#include <sys/socket.h>
int shutdown(int sockfd, int how);
If how is SHUT_RD, then reading from the socket is disabled. If how is SHUT_WR,
then we can’t use the socket for transmitting data. We can use SHUT_RDWR to
disable both data transmission and reception.
Given that we can close a socket, why is shutdown needed? There are several
reasons. First, close will deallocate the network endpoint only when the last active
reference is closed. If we duplicate the socket (with dup, for example), the socket won’t
be deallocated until we close the last file descriptor referring to it. The shutdown
function allows us to deactivate a socket independently of the number of active file
descriptors referencing it. Second, it is sometimes convenient to shut a socket down in
one direction only. For example, we can shut a socket down for writing if we want the
process we are communicating with to be able to tell when we are done transmitting
data, while still allowing us to use the socket to receive data sent to us by the process.
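The one-direction case can be sketched with a socket pair inside a single process: side A sends a request and shuts down its writing half, side B sees end-of-file after the data, and B can still answer. The function name half_close_demo and the messages are assumptions made for this illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Demonstrate shutdown(SHUT_WR) on a connected socket pair.
   On success returns 0 and stores the answer received by side A in reply. */
int half_close_demo(char *reply, size_t replylen)
{
    int sv[2];
    char buf[32];
    ssize_t n;
    int ok = -1;

    assert(reply != NULL && replylen > 0);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* Side A sends its request and announces that it is done transmitting. */
    if (write(sv[0], "request", 7) != 7)
        goto done;
    if (shutdown(sv[0], SHUT_WR) == -1)
        goto done;

    /* Side B reads the data, then sees end-of-file (read returns 0). */
    if (read(sv[1], buf, sizeof(buf)) != 7)
        goto done;
    if (read(sv[1], buf, sizeof(buf)) != 0)
        goto done;

    /* B can still send, and A can still receive. */
    if (write(sv[1], "reply", 5) != 5)
        goto done;
    n = read(sv[0], reply, replylen - 1);
    if (n < 0)
        goto done;
    reply[n] = '\0';
    ok = 0;
done:
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```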
Byte Ordering
The TCP/IP protocol suite uses big-endian byte order.
#include <arpa/inet.h>
uint32_t htonl(uint32_t hostint32);
Returns: 32-bit integer in network byte order
uint16_t htons(uint16_t hostint16);
Returns: 16-bit integer in network byte order
uint32_t ntohl(uint32_t netint32);
Returns: 32-bit integer in host byte order
uint16_t ntohs(uint16_t netint16);
Returns: 16-bit integer in host byte order
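A short sketch of what the conversion does to the bytes in memory: after htons, the most significant byte comes first regardless of the host's own byte order. The helper names port_wire_bytes and host_is_little_endian are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert a 16-bit host value to network byte order and copy the
   resulting two bytes, in memory order, into out[]. */
void port_wire_bytes(uint16_t host_port, unsigned char out[2])
{
    uint16_t net = htons(host_port);

    assert(out != NULL);
    memcpy(out, &net, sizeof(net));
}

/* Return 1 if this host stores the least significant byte first. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a little-endian host the conversion swaps the bytes, and on a big-endian host it is a no-op. In both cases ntohs(htons(x)) gives back x, which is why portable code always converts port numbers and addresses before placing them in a sockaddr_in.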
struct sockaddr_in {
sa_family_t sin_family; /* address family */
in_port_t sin_port; /* port number */
struct in_addr sin_addr; /* IPv4 address */
};
inet_ntop – network to presentation
#include <arpa/inet.h>
const char *inet_ntop(int domain, const void *restrict addr,
char *restrict str, socklen_t size);
Returns: pointer to address string on success, NULL on error
int inet_pton(int domain, const char *restrict str,
void *restrict addr);
Returns: 1 on success, 0 if the format is invalid, or −1 on error
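The two conversion functions and sockaddr_in fit together as in the following sketch. The helper names ipv4_round_trip and make_ipv4_endpoint are illustrative assumptions; the library calls are the ones listed above.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Parse a dotted-decimal IPv4 string and format it back into out.
   Returns 1 on success, 0 if text is not a valid address, -1 on error. */
int ipv4_round_trip(const char *text, char out[INET_ADDRSTRLEN])
{
    struct in_addr addr;
    int rc = inet_pton(AF_INET, text, &addr);

    if (rc != 1)
        return rc;
    if (inet_ntop(AF_INET, &addr, out, INET_ADDRSTRLEN) == NULL)
        return -1;
    return 1;
}

/* Fill a sockaddr_in for the given address string and host-order port.
   Returns the inet_pton() result (1 on success). */
int make_ipv4_endpoint(const char *ip, uint16_t port, struct sockaddr_in *sa)
{
    assert(sa != NULL);
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);         /* port in network byte order */
    return inet_pton(AF_INET, ip, &sa->sin_addr);
}
```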
Address Look Up
To iterate over the entries of the host database on the machine:
#include <netdb.h>
struct hostent *gethostent(void);
Returns: pointer if OK, NULL on error
void sethostent(int stayopen);
void endhostent(void);
struct hostent {
char *h_name;
char **h_aliases;
int h_addrtype;
int h_length;
char **h_addr_list;
.
};
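DNS
gethostbyname() and gethostbyaddr() are obsolete; getaddrinfo() and getnameinfo() replace
them. Network names and numbers can be looked up with a similar set of interfaces: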
#include <netdb.h>
struct netent *getnetbyaddr(uint32_t net, int type);
struct netent *getnetbyname(const char *name);
struct netent *getnetent(void);
All return: pointer if OK, NULL on error
void setnetent(int stayopen);
void endnetent(void);
The netent structure contains at least the following fields:
struct netent {
char *n_name; /*network name */
char **n_aliases; /*alternate network name array pointer */
int n_addrtype; /*address type */
uint32_t n_net; /*network number */
.
};
We can map between protocol names and numbers with the following functions.
#include <netdb.h>
struct protoent *getprotobyname(const char *name);
struct protoent *getprotobynumber(int proto);
struct protoent *getprotoent(void);
All return: pointer if OK, NULL on error
void setprotoent(int stayopen);
void endprotoent(void);
The protoent structure as defined by POSIX.1 has at least the following members:
struct protoent {
char *p_name; /* protocol name */
char **p_aliases; /* pointer to alternate protocol name array */
int p_proto;/* protocol number */
.
};
Services are represented by the port number portion of the address. Each service is
offered on a unique, well-known port number. We can map a service name to a port
number with getservbyname, map a port number to a service name with
getservbyport, or scan the services database sequentially with getservent.
#include <netdb.h>
struct servent *getservbyname(const char *name, const char *proto);
struct servent *getservbyport(int port, const char *proto);
struct servent *getservent(void);
All return: pointer if OK, NULL on error
void setservent(int stayopen);
void endservent(void);
The servent structure is defined to have at least the following members:
struct servent {
char *s_name;
char **s_aliases;
int s_port;
char *s_proto;
.
};
#include <sys/socket.h>
int bind(int sockfd, const struct sockaddr *addr, socklen_t len);
Returns: 0 if OK, −1 on error
• sockfd - the socket file descriptor returned by socket().
• addr - a pointer to a struct sockaddr that contains information about the IP address
and port number.
• len - the size of the address structure, e.g. sizeof (struct sockaddr_in) for IPv4.
int connect (int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);
• sockfd - the socket file descriptor returned by socket().
• serv_addr - a struct sockaddr containing the destination port and IP address.
• addrlen - the size of the address structure, e.g. sizeof (struct sockaddr_in).
int listen (int sockfd,int backlog);
• sockfd - the socket file descriptor returned by socket().
• backlog - the number of connections allowed on the incoming queue.
• Backlog should not be zero, as a server always expects connections from clients.
• The listen function converts an unconnected socket into a passive socket.
• A successful call to listen tells the kernel to accept incoming connection requests
directed to this socket.
int accept (int sockfd, struct sockaddr *addr, socklen_t *addrlen);
• sockfd - the listening socket file descriptor returned by socket().
• addr - a pointer to a struct sockaddr_in in which information about the incoming
connection, such as IP address and port number, is stored.
• addrlen - a local socklen_t variable that should be set to sizeof (struct
sockaddr_in) before its address is passed to accept().
close (sockfd);
• The close system call prevents any more reads and writes on the socket. After the
close, a read on the remote end returns 0 (end of file) and a write there eventually
fails with an error.
int shutdown (int sockfd, int how);
• sockfd - file descriptor of the socket to be shut down.
• how - one of:
0 (SHUT_RD) - further receives are disallowed
1 (SHUT_WR) - further sends are disallowed
2 (SHUT_RDWR) - further sends and receives are disallowed
The shutdown system call gives more control than close (sockfd) over how the
socket is closed.
Typical server code
struct sockaddr_in serv, cli;
socklen_t clilen = sizeof (cli);
sd = socket (AF_INET, SOCK_STREAM, 0);
serv.sin_family = AF_INET;
serv.sin_addr.s_addr = htonl (INADDR_ANY);
serv.sin_port = htons (portno);
bind (sd, (struct sockaddr *) &serv, sizeof (serv));
listen (sd, 5);
nsd = accept (sd, (struct sockaddr *) &cli, &clilen);
read / write (nsd, ....);
Typical client code
struct sockaddr_in serv;
sd = socket (AF_INET, SOCK_STREAM, 0);
serv.sin_family = AF_INET;
serv.sin_addr.s_addr = inet_addr ("server ip");
serv.sin_port = htons (portno);
connect (sd, (struct sockaddr *) &serv, sizeof (serv));
read / write (sd, ....);
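The server and client sequences can be exercised together in one process over the loopback interface. This is a sketch under several assumptions: the function name tcp_loopback is hypothetical, port 0 is used so the kernel picks a free port, and the connect() completes before accept() is called because the kernel queues the connection in the listen backlog. The message must not be empty, otherwise the read would wait for data.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send msg from a client socket to a server socket in the same process.
   Returns the number of bytes the server side read into out, or -1 on error. */
ssize_t tcp_loopback(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in serv, cli;
    socklen_t len = sizeof(serv), clilen = sizeof(cli);
    int sd = -1, cd = -1, nsd = -1;
    ssize_t n = -1;

    assert(msg != NULL && msg[0] != '\0' && out != NULL && outlen > 0);
    memset(&serv, 0, sizeof(serv));
    serv.sin_family = AF_INET;
    serv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    serv.sin_port = htons(0);           /* assumption: let the kernel pick a port */

    /* Server side: socket, bind, listen. */
    if ((sd = socket(AF_INET, SOCK_STREAM, 0)) == -1)
        goto done;
    if (bind(sd, (struct sockaddr *)&serv, sizeof(serv)) == -1)
        goto done;
    if (listen(sd, 5) == -1)
        goto done;
    /* Learn which port was assigned so the client can connect to it. */
    if (getsockname(sd, (struct sockaddr *)&serv, &len) == -1)
        goto done;

    /* Client side: socket, connect, write. */
    if ((cd = socket(AF_INET, SOCK_STREAM, 0)) == -1)
        goto done;
    if (connect(cd, (struct sockaddr *)&serv, sizeof(serv)) == -1)
        goto done;
    if (write(cd, msg, strlen(msg)) == -1)
        goto done;

    /* Server side: accept the queued connection and read from it. */
    if ((nsd = accept(sd, (struct sockaddr *)&cli, &clilen)) == -1)
        goto done;
    n = read(nsd, out, outlen - 1);
    if (n >= 0)
        out[n] = '\0';
done:
    if (nsd != -1)
        close(nsd);
    if (cd != -1)
        close(cd);
    if (sd != -1)
        close(sd);
    return n;
}
```

Note that accept() returns a new descriptor (nsd) for the connection, while the listening descriptor (sd) stays available for further clients.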
Iterative Server
One client request is served at a time.
while (1) {
    nsd = accept (sd, &cli, ...);
    read/write (nsd, ...);
    close (nsd);
}
Concurrent Server
Many client requests can be serviced concurrently.
while (1) {
    nsd = accept (sd, &cli, ....);
    if (!fork ()) {
        close (sd);
        read/write (nsd, .....);
        exit (0);
    } else
        close (nsd);
}
/* This program illustrates a concurrent server: the parent accepts
connections and creates a child process to serve each client */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define MYPORT 1034
int main(void)
{
    int sd, nsd, dat;
    char message[40];
    socklen_t length;
    struct sockaddr_in server, client;
    if ((sd = socket(PF_INET, SOCK_STREAM, 0)) == -1) {
        perror("socket");
        exit(1);
    }
    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_port = htons(MYPORT);
    server.sin_addr.s_addr = inet_addr("192.168.2.20");
    if (bind(sd, (struct sockaddr *)&server, sizeof(server)) == -1) {
        perror("bind");
        exit(1);
    }
    if (listen(sd, 1) == -1) {
        perror("listen");
        exit(1);
    }
    signal(SIGCHLD, SIG_IGN);   /* let the kernel reap finished children */
    printf("Waiting for connection.............\n");
    while (1) {
        length = sizeof(client);
        if ((nsd = accept(sd, (struct sockaddr *)&client, &length)) == -1) {
            perror("accept");
            exit(1);
        }
        printf("Got connection from client: %s\n", inet_ntoa(client.sin_addr));
        if (fork() == 0) {
            /* child process: takes care of receive and send for this client */
            close(sd);
            if ((dat = recv(nsd, message, sizeof(message) - 1, 0)) == -1) {
                perror("recv");
                exit(1);
            }
            message[dat] = '\0';
            printf("Data received is : %s\n", message);
            printf("Enter the data you want to send to client\n");
            if (fgets(message, sizeof(message), stdin) != NULL)
                send(nsd, message, strlen(message), 0);
            close(nsd);
            exit(0);
        }
        close(nsd);             /* parent: the child owns this connection now */
    }
}
With UDP, both the client and the server use sendto() and recvfrom():
#include <sys/socket.h>
ssize_t sendto(int sockfd, const void *buf, size_t nbytes, int flags, const struct
sockaddr *destaddr, socklen_t destlen);
Returns: number of bytes sent if OK, −1 on error
#include <sys/socket.h>
ssize_t recvfrom(int sockfd, void *restrict buf, size_t len, int flags, struct
sockaddr *restrict addr, socklen_t *restrict addrlen);
Returns: length of message in bytes, 0 if no messages are available and peer has done
an orderly shutdown, or −1 on error
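A minimal sketch of one datagram exchange over the loopback interface, with both ends in the same process. The function name udp_loopback and its from_client output are assumptions for this illustration; port 0 lets the kernel choose a free port. Unlike TCP there is no listen, connect or accept: the receiver learns the sender's address from recvfrom.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send msg as one datagram from a client socket to a server socket.
   Returns the number of bytes received into out, or -1 on error.
   *from_client is set to 1 if the source port reported by recvfrom()
   matches the client's own port. */
ssize_t udp_loopback(const char *msg, char *out, size_t outlen, int *from_client)
{
    struct sockaddr_in serv, from, self;
    socklen_t len = sizeof(serv), fromlen = sizeof(from), selflen = sizeof(self);
    int sd = -1, cd = -1;
    ssize_t n = -1;

    assert(msg != NULL && out != NULL && outlen > 0 && from_client != NULL);
    *from_client = 0;
    memset(&serv, 0, sizeof(serv));
    serv.sin_family = AF_INET;
    serv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    serv.sin_port = htons(0);           /* assumption: let the kernel pick a port */

    /* Server side: socket and bind only. */
    if ((sd = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
        goto done;
    if (bind(sd, (struct sockaddr *)&serv, sizeof(serv)) == -1)
        goto done;
    if (getsockname(sd, (struct sockaddr *)&serv, &len) == -1)
        goto done;

    /* Client side: the destination is named on every sendto() call. */
    if ((cd = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
        goto done;
    if (sendto(cd, msg, strlen(msg), 0,
               (struct sockaddr *)&serv, sizeof(serv)) == -1)
        goto done;
    /* The first sendto() bound the client to an ephemeral port. */
    if (getsockname(cd, (struct sockaddr *)&self, &selflen) == -1)
        goto done;

    /* Server side: receive the datagram together with the sender address. */
    n = recvfrom(sd, out, outlen - 1, 0, (struct sockaddr *)&from, &fromlen);
    if (n >= 0) {
        out[n] = '\0';
        *from_client = (from.sin_port == self.sin_port);
    }
done:
    if (cd != -1)
        close(cd);
    if (sd != -1)
        close(sd);
    return n;
}
```

A server would pass the address filled in by recvfrom back to sendto in order to reply to that client.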
Day 5 - Morning
12. Network Programming
●TCP Server Client Programming
●UDP Server Client Programming
●Netlink socket interface
Tcpclient
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define PORT 1034
struct sockaddr_in server;
int main(void)
{
    int n, sd;
    char msg[40];
    if ((sd = socket(PF_INET, SOCK_STREAM, 0)) == -1) {
        perror("socket");
        exit(1);
    }
    server.sin_family = AF_INET;
    server.sin_port = htons(PORT);
    // server.sin_addr.s_addr = inet_addr("192.168.1.2");
    server.sin_addr.s_addr = inet_addr("127.0.0.1");
    if (connect(sd, (struct sockaddr *)&server, sizeof(server)) == -1) {
        perror("connect");
        exit(1);
    }
    printf("Enter the message you want to send to server\n");
    if (fgets(msg, sizeof(msg), stdin) == NULL)
        msg[0] = '\0';
    send(sd, msg, strlen(msg), 0);      /* send only the bytes that were typed */
    printf("Waiting for message from server..............\n");
    /* leave one byte free for the terminating '\0' */
    if ((n = recv(sd, msg, sizeof(msg) - 1, 0)) == -1) {
        perror("recv");
        exit(1);
    }
    msg[n] = '\0';
    printf("Message received from server is:%s\n", msg);
    close(sd);
    return 0;
}
Tcpserver
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define MYPORT 1034
int main(void)
{
    int sd, nsd, dat, yes = 1;
    char message[40];
    struct sockaddr_in server, client;
    socklen_t length;
    if ((sd = socket(PF_INET, SOCK_STREAM, 0)) == -1) {
        perror("socket");
        exit(1);
    }
    memset(&server, 0, sizeof(server));
    server.sin_port = htons(MYPORT);
    server.sin_family = AF_INET;
    // server.sin_addr.s_addr = inet_addr("192.168.1.2");
    server.sin_addr.s_addr = inet_addr("127.0.0.1");
    /* Optional: allow the port to be reused right after a restart
    if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(int)) == -1) {
        perror("setsockopt");
        exit(1);
    }*/
    (void)yes;
    if (bind(sd, (struct sockaddr *)&server, sizeof(server)) == -1) {
        perror("bind");
        exit(1);
    }
    if (listen(sd, 5) == -1) {
        perror("listen");
        exit(1);
    }
    printf("Waiting for connection.............\n");
    length = sizeof(client);    /* accept() needs the buffer size on input */
    if ((nsd = accept(sd, (struct sockaddr *)&client, &length)) == -1) {
        perror("accept");
        exit(1);
    }
    printf("Got connection from client:%s\n", inet_ntoa(client.sin_addr));
    if ((dat = recv(nsd, message, sizeof(message) - 1, 0)) == -1) {
        perror("recv");
        exit(1);
    }
    message[dat] = '\0';
    printf("Data received is : %s\n", message);
    printf("Enter the data you want to send to client\n");
    if (fgets(message, sizeof(message), stdin) != NULL)
        send(nsd, message, strlen(message), 0);
    close(nsd);
    close(sd);
    return 0;
}
udpclient
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define PORT 1034
int main(void)
{
    int n, sd;
    socklen_t length;
    char msg[40];
    struct sockaddr_in server, from;
    if ((sd = socket(PF_INET, SOCK_DGRAM, 0)) == -1) {
        perror("socket");
        exit(1);
    }
    /* address of the server the datagram is sent to */
    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_port = htons(PORT);
    // server.sin_addr.s_addr = inet_addr("192.168.1.2");
    server.sin_addr.s_addr = inet_addr("127.0.0.1");
    printf("Enter the message you want to send to server\n");
    if (fgets(msg, sizeof(msg), stdin) == NULL)
        msg[0] = '\0';
    if (sendto(sd, msg, strlen(msg), 0, (struct sockaddr *)&server, sizeof(server)) == -1) {
        perror("sendto");
        exit(1);
    }
    printf("Waiting for message from server..............\n");
    length = sizeof(from);
    if ((n = recvfrom(sd, msg, sizeof(msg) - 1, 0, (struct sockaddr *)&from, &length)) == -1) {
        perror("recvfrom");
        exit(1);
    }
    msg[n] = '\0';
    printf("Message received from server is:%s\n", msg);
    close(sd);
    return 0;
}
udpserver
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define MYPORT 1034
int main(void)
{
    int sd, dat, yes = 1;
    socklen_t length;
    char message[40];
    struct sockaddr_in server, client;
    if ((sd = socket(PF_INET, SOCK_DGRAM, 0)) == -1) {
        perror("socket");
        exit(1);
    }
    memset(&server, 0, sizeof(server));
    server.sin_port = htons(MYPORT);
    server.sin_family = AF_INET;
    // server.sin_addr.s_addr = inet_addr("192.168.1.2");
    server.sin_addr.s_addr = inet_addr("127.0.0.1");
    /* Optional: allow the port to be reused right after a restart
    if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(int)) == -1) {
        perror("setsockopt");
        exit(1);
    }*/
    (void)yes;
    if (bind(sd, (struct sockaddr *)&server, sizeof(server)) == -1) {
        perror("bind");
        exit(1);
    }
    length = sizeof(client);
    /* leave one byte free for the terminating '\0' */
    if ((dat = recvfrom(sd, message, sizeof(message) - 1, 0,
                        (struct sockaddr *)&client, &length)) == -1) {
        perror("recvfrom");
        exit(1);
    }
    printf("Got datagram from client:%s\n", inet_ntoa(client.sin_addr));
    message[dat] = '\0';
    printf("Data received is : %s\n", message);
    printf("Enter the data you want to send to client\n");
    if (fgets(message, sizeof(message), stdin) != NULL)
        sendto(sd, message, strlen(message), 0, (struct sockaddr *)&client, length);
    close(sd);
    return 0;
}
netlink - Communication between kernel and userspace (PF_NETLINK)
#include <asm/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>
netlink_socket = socket(PF_NETLINK, socket_type, netlink_family);
Netlink is used to transfer information between kernel and userspace processes.
It consists of a standard sockets-based interface for userspace processes and an internal
kernel API for kernel modules.
Netlink is a datagram-oriented service. Both SOCK_RAW and
SOCK_DGRAM are valid values for socket_type. However, the netlink protocol
does not distinguish between datagram and raw sockets.
netlink_family selects the kernel module or netlink group to communicate
with. The currently assigned netlink families are:
NETLINK_ROUTE
Receives routing and link updates and may be used to modify the routing
tables (both IPv4 and IPv6), IP addresses, link parameters, neighbour setups,
queueing disciplines, traffic classes and packet classifiers
NETLINK_W1
Messages from 1-wire subsystem.
The following example creates a NETLINK_ROUTE netlink socket which will listen to the
RTMGRP_LINK (network interface create/delete/up/down events) and
RTMGRP_IPV4_IFADDR (IPv4 addresses add/delete events) multicast groups.
struct sockaddr_nl sa;
memset (&sa, 0, sizeof(sa));
sa.nl_family = AF_NETLINK;
sa.nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR;
fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
bind(fd, (struct sockaddr*)&sa, sizeof(sa));
The next example demonstrates how to send a netlink message to the kernel (pid 0). Note
that the application must take care of message sequence numbers in order to reliably
track acknowledgements.
struct nlmsghdr *nh; /* The nlmsghdr with payload to send. */
struct sockaddr_nl sa;
struct iovec iov = { (void *) nh, nh->nlmsg_len };
struct msghdr msg = { (void *)&sa, sizeof(sa), &iov, 1, NULL, 0, 0 };
memset (&sa, 0, sizeof(sa));
sa.nl_family = AF_NETLINK;
nh->nlmsg_pid = 0;
nh->nlmsg_seq = ++sequence_number;
/* Request an ack from kernel by setting NLM_F_ACK. */
nh->nlmsg_flags |= NLM_F_ACK;
sendmsg (fd, &msg, 0);
The last example reads a netlink message.
int len;
char buf[4096];
struct iovec iov = { buf, sizeof(buf) };
struct sockaddr_nl sa;
struct msghdr msg = { (void *)&sa, sizeof(sa), &iov, 1, NULL, 0, 0 };
struct nlmsghdr *nh;
len = recvmsg (fd, &msg, 0);
for (nh = (struct nlmsghdr *) buf; NLMSG_OK (nh, len);
nh = NLMSG_NEXT (nh, len)) {
/* The end of multipart message. */
if (nh->nlmsg_type == NLMSG_DONE)
return;
if (nh->nlmsg_type == NLMSG_ERROR)
/* Do some error handling. */
...
/* Continue with parsing payload. */
...
}
Day 5 - Morning
13. Programming and Debugging Tools
●strace – Tracing system calls
●ltrace – Tracing library calls
●Tools used to detect memory access errors and memory leakage in Linux (mtrace)
●Using gdb and ddd utilities
●Core dump analysis etc...
Tracing Processes
• strace command
– trace system calls and signals
– strace runs until the given command exits
– It is a useful tool for diagnostics, instruction and debugging
• ptrace system call
– Process trace
Strace
# strace -c -e trace=file mkfifo -m 0744 myfifo
execve("/usr/bin/mkfifo", ["mkfifo", "-m", "0744", "myfifo"]) = 0
% time     seconds     us/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 47.62    0.000020          20         1           mknod
 33.33    0.000014           4         4           open
 11.90    0.000005           5         1           chmod
  7.14    0.000003           1         3           fstat
------ ----------- ----------- --------- --------- ----------------
100.00    0.000042                     9           total
1. Trace the Execution of an Executable
$ strace ls
execve("/bin/ls", ["ls"], [/* 21 vars */]) = 0
brk(0)
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb78c7000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=65354, ...}) = 0
...
...
2. Trace Specific System Calls in an Executable Using Option -e
$ strace -e open ls
open("/etc/ld.so.cache", O_RDONLY) = 3
open("/lib/libselinux.so.1", O_RDONLY) = 3
open("/lib/librt.so.1", O_RDONLY) = 3
open("/lib/libacl.so.1", O_RDONLY) = 3
open("/lib/libc.so.6", O_RDONLY) = 3
open("/lib/libdl.so.2", O_RDONLY) = 3
open("/lib/libpthread.so.0", O_RDONLY) = 3
open("/lib/libattr.so.1", O_RDONLY) = 3
open("/proc/filesystems", O_RDONLY|O_LARGEFILE) = 3
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
open(".", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3
3. Execute Strace on a Running Linux Process Using Option -p
$ strace -p 1725 -o output.txt
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
Could not attach to process. If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
4. Print Relative Time for System Calls Using Option -r
With -r, strace prints a relative timestamp on entry to each system call, i.e. the time
elapsed since the beginning of the previous system call, as shown below.
$ strace -r ls
0.000000 execve("/bin/ls", ["ls"], [/* 37 vars */]) = 0
0.000846 brk(0) = 0x8418000
0.000143 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
0.000163 mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb787b000
0.000119 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
0.000123 open("/etc/ld.so.cache", O_RDONLY) = 3
0.000099 fstat64(3, {st_mode=S_IFREG|0644, st_size=67188, ...}) = 0
0.000155 mmap2(NULL, 67188, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb786a000
...
ltrace is another Linux utility similar to strace. However, ltrace lists the library calls
made by an executable or a running process.
This tool is very useful for debugging user-space applications to determine which library call
is failing.
It also reports the signals the process receives, such as SIGSEGV on a segmentation fault.
Assume the following code in file.c:
#include <stdio.h>
#include <unistd.h>

int main()
{
    FILE *fp = fopen("rfile.txt", "w+");
    fprintf(fp+1, "Invalid Write\n");   /* deliberate bug: fp+1 is not a valid stream */
    fclose(fp);
    return 0;
}
Let's compile and run it.
$ gcc file.c -Wall -o file
$ ./file
Segmentation fault (core dumped)
That is a segmentation fault. Let's use ltrace to debug and see what is happening.
$ ltrace ./file
__libc_start_main(0x8048454, 1, 0xbfc19db4, 0x80484c0, 0x8048530 <unfinished ...>
fopen("rfile.txt", "w+") = 0x9160008
fwrite("Invalid Write\n", 1, 14, 0x916009c <unfinished ...>
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++
The trace shows that fwrite was called with the stream 0x916009c, which is not the
pointer 0x9160008 returned by fopen, and that the process was killed inside that call.
mtrace (memory trace) reports memory that was allocated but never freed. Follow these steps to use it:
1. Call mtrace() When Your Program Starts
#include <stdio.h>
#include <stdlib.h>
#include <mcheck.h>
int main() {
char *string;
mtrace();
string = malloc(100 * sizeof(char));
return 0;
}
2. Compile Program with Debugging Options
$ gcc -g -o mtrace_test mtrace_test.c
3. Set MALLOC_TRACE
For bash:
export MALLOC_TRACE="mtrace.out"
For C shell, it would be:
setenv MALLOC_TRACE mtrace.out
4. Run The Program Once
5. View The Data
mtrace <prog name> <output log file name>
mtrace mtrace_test mtrace.out
Assuming the C code at the beginning was the code in mtrace_test.c, the following output
would be produced:
Memory not freed:
-----------------
           Address     Size     Caller
0x0000000000501460     0x64  at /array/home/dcurrie/test/mtrace/mtrace_test.c:11
Valgrind
Finding Memory Leaks With Valgrind
example.c
#include <stdlib.h>
int main()
{
    char *x = malloc(100); /* or, in C++, "char *x = new char[100]" */
    x[10] = 'a';
    return 0;
}
$ gcc example.c -o example
$ valgrind --tool=memcheck --leak-check=yes ./example
==2116== 100 bytes in 1 blocks are definitely lost in loss record 1 of 1
==2116==    at 0x1B900DD0: malloc (vg_replace_malloc.c:131)
==2116==    by 0x804840F: main (in /home/cprogram/example1)
Finding Invalid Pointer Use With Valgrind
Running
valgrind --tool=memcheck --leak-check=yes example
on a program that writes one byte past the end of a 10-byte block results in the following warning:
==9814== Invalid write of size 1
==9814==    at 0x804841E: main (example2.c:6)
==9814==  Address 0x1BA3607A is 0 bytes after a block of size 10 alloc'd
==9814==    at 0x1B900DD0: malloc (vg_replace_malloc.c:131)
==9814==    by 0x804840F: main (example2.c:5)