Java 828242
http://www.ccs.neu.edu/home/matthias/HtDP2e/index.html
Size of a file: du -h /a.txt
hadoop123
install pdsh
cd pdsh
./configure
make
make install
pdsh -R exec -w 192.168.1.3[0-2] ssh -x -l %u %h yum -y install krb5-workstation.x86_64
pdsh -R exec -w 192.168.1.2[1-7] ssh -x -l %u %h date -s \"16 APR 2012 19:04:09\"
Replace words in vi (Linux, in the editor)
:%s/foobar/hadoop/g
:%s/\/dfsdata//g
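The same substitutions can be run non-interactively with sed; a quick sketch (the sample strings are made up, the patterns are the ones from the vi commands above):

```shell
# :%s/foobar/hadoop/g equivalent: replace every "foobar" with "hadoop"
echo "foobar cluster foobar node" | sed 's/foobar/hadoop/g'

# :%s/\/dfsdata//g equivalent: strip every "/dfsdata"; using | as the
# sed delimiter avoids escaping the slashes
echo "/dfsdata/storage/name" | sed 's|/dfsdata||g'
```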
2012-02-21 01:36:55,819 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
Datanode state: LV = -19 CTime = 1328715960120 is newer than the
namespace state: LV = -19 CTime = 0
Number of mappers depends on the input file (its splits)
Configuration
Important to know the current features and how the new APIs are evolving
Job
Inputformats
Mapperclass
Reducerclass
Outputformat
Key and value
Java reflection
Amount of reads you do vs. amount of writes you do
Network bandwidth
Number of songs per artist
Write your own object
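The songs-per-artist count can be prototyped on the command line before writing the MapReduce job; the artist names below are made-up sample data, and sort | uniq -c plays the role of the shuffle and reduce steps:

```shell
# Count occurrences per key: sort groups the keys (shuffle),
# uniq -c counts each group (reduce). Sample data is made up.
printf 'beatles\nstones\nbeatles\nbeatles\n' | sort | uniq -c | sort -rn
```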
LZO rpms http://pkgs.repoforge.org/lzo/
Networking (Linux)
http://www.linuxhomenetworking.com/wiki/index.php/Main_Page
crontab
*/10 * * * * netstat -plten >> /root/netstat.log 2>&1
*/10 * 3 * * netstat -plten 2>&1 | mail -s "cronjob output" [email protected]
# umount /media/disk/
umount: /media/disk: device is busy
umount: /media/disk: device is busy
The first thing you'll probably do is close down all your terminals and xterms, but here's a better way: use the fuser command to find out which process is keeping the device busy:
# fuser -m /dev/sdc1
/dev/sdc1: 538
# ps auxw|grep 538
donncha 538 0.4 2.7 219212 56792 ? SLl Feb11 11:25 rhythmbox
Mount problem
mount -o remount,rw /
Edit vi /etc/fstab
/dev/sda1
/storage/data1
reboot
clusteradmin ALL=(ALL) NOPASSWD: ALL
hwclock --set --date="5/1/10 15:48:07"
# date -s "2 OCT 2006 18:00:00"
date --set="2 OCT 2006 18:00:00"
date +%Y%m%d -s "20081128"
date +%T -s "10:13:13"
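GNU date can also parse these strings without touching the clock, which is handy for checking a format before running date -s:

```shell
# Parse a date string and reformat it without setting the system clock
# (same string the date -s example above uses)
date -d "2 OCT 2006 18:00:00" +%Y-%m-%d_%H:%M:%S
```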
Linux commands
http://support.nagios.com/knowledgebase/faqs/index.php?
option=com_content&view=article&id=52&catid=35&faq_id=305&expand=f
alse&showdesc=true
http://yahoo.github.com/hadoop-common/installing.html
export PATH=$PATH:/usr/bin/
export JAVA_HOME=/usr/java/jdk1.7.0/
export PATH=$PATH:$JAVA_HOME/bin
rpm -i --force jdk-1.6...
If java -version still shows 1.4:
rm /usr/bin/java
ln -s /usr/java/jdk1.6.../bin/java /usr/bin/java
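The symlink swap can be rehearsed in a scratch directory first; the /tmp paths below are hypothetical stand-ins for /usr/java/... and /usr/bin/java:

```shell
# Rehearse the rm + ln -s swap on throwaway paths
mkdir -p /tmp/jdkdemo/jdk1.6/bin
echo '#!/bin/sh' > /tmp/jdkdemo/jdk1.6/bin/java    # fake java binary
rm -f /tmp/jdkdemo/java                            # the old link (rm /usr/bin/java)
ln -s /tmp/jdkdemo/jdk1.6/bin/java /tmp/jdkdemo/java
readlink /tmp/jdkdemo/java                         # shows where the link now points
```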
fdisk -l | grep Disk
/etc/redhat-release
uname
uname -a
vi .bash_profile
export HADOOP_HOME=/home/hadoop/hadoop-0…./
export PATH=$PATH:$HADOOP_HOME/bin
for "command not found" errors
PXE
/tftpboot/pxelinux.cfg/default
Set replication 2
bin/hadoop fs -setrep -R -w 2 /tmp/hadoophadoop/mapred/staging/hadoop/.staging
default Centos
LABEL Centos
MENU LABEL Centos
KERNEL images/centos/x86_64/5.6/vmlinuz
append vga=normal initrd=images/centos/x86_64/5.6/initrd.img
ramdisk_size=32768
ksdevice=eth0 ks=ftp://192.168.1.45/install/ks/ks.cfg
Avinash.ldif
# avinash, localdomain.com
#dn: uid=root,ou=People,dc=localdomain,dc=com
#uid: root
#cn: admin
#objectClass: account
#objectClass: posixAccount
#objectClass: top
#objectClass: shadowAccount
#userPassword: {SSHA}PCHPZji+1m+sX0HwudP+UEqL9RZ4CXNR
#shadowLastChange: 15221
#shadowMin: 0
#shadowMax: 99999
#shadowWarning: 7
#loginShell: /bin/bash
#uidNumber: 0
#gidNumber: 0
#homeDirectory: /root
#gecos: root
dn: uid=arun,ou=People,dc=localdomain,dc=com
cn: arun kumar
sn: kumar
objectClass: top
objectClass: person
objectClass: posixAccount
objectClass: shadowAccount
userPassword: {SSHA}PCHPZji+1m+sX0HwudP+UEqL9RZ4CXNR
uid: arun
uidNumber: 502
gidNumber: 501
loginShell: /bin/bash
homeDirectory: /home/arun
shadowLastChange: 10877
shadowMin: 0
shadowMax: 999999
shadowInactive: -1
Ldap Authentication
<Directory "/var/www/html">
AuthType Basic
AuthName "enter your login id"
AuthBasicProvider ldap
AuthzLDAPAuthoritative off
AuthLDAPURL ldap://192.168.1.45:389/dc=localdomain,dc=com?uid?sub
require valid-user
Options None
</Directory>
Ganglia
Download EPEL(extra packages for enterprise linux)
[user@host ~]$ sudo rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm
sudo yum install rrdtool ganglia ganglia-gmetad ganglia-gmond
ganglia-web
sudo /sbin/chkconfig --levels 235 gmond on
sudo /sbin/service gmond start
sudo vim /etc/gmetad.conf
sudo /sbin/chkconfig --levels 235 gmetad on
sudo /sbin/service gmetad start
yum install httpd
In gmond.conf, set host = <ip addr> in the udp_send_channel section, and set the cluster name, e.g.
cluster {
name = "green"
}
puppet
http://www.linuxforu.com/how-to/puppet-show-automating-unixadministration/
http://library.linode.com/application-stacks/ puppet/installation#sph_configuring-puppet
Download EPEL if your Linux doesn't already have it.
rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm
yum install puppet-server --enablerepo=epel
yum install ruby-rdoc
vi /etc/puppet/manifests/site.pp
# /etc/puppet/manifests/site.pp
import "classes/*"
node default {
include sudo
}
vi /etc/puppet/manifests/classes/sudo.pp
# /etc/puppet/manifests/classes/sudo.pp
class sudo {
file { "/etc/sudoers":
owner => "root",
group => "root",
mode => 440,
}
}
service puppet-server start
# chkconfig puppet-server on
client
yum install puppet --enablerepo=epel
yum install ruby-rdoc
vi /etc/sysconfig/puppet
# The puppetmaster server
PUPPET_SERVER=PuppetMaster
# If you wish to specify the port to connect to do so here
#PUPPET_PORT=8140
# Where to log to. Specify syslog to send log messages to the system log.
PUPPET_LOG=/var/log/puppet/puppet.log
# You may specify other parameters to the puppet client here
#PUPPET_EXTRA_OPTS=--waitforcert=500
# service puppet start
# chkconfig puppet on
ping -c 3 puppet
Change the hostname and domain name on the puppet server and clients to fully qualified domain names.
puppetd --server puppet.example.com --waitforcert 60 --test
A message will appear:
Info: creating .......
On the puppet-server machine:
puppetca --list
puppetca --sign puppetclient.example.com
Check whether your puppetmaster and puppet client are on or off.
Stop your iptables.
Check your hostname and domain name.
Make sure certificates transfer between puppet and puppetmaster.
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multinode-cluster/#networking
file { "/avi":
source => "/etc/httpd/conf/httpd.conf",
recurse => "true",
}
Restart puppetmaster
puppetd --server puppet.example.com --waitforcert 60 --test
If there are any request errors, delete the contents of /var/lib/puppet/ssl/certs and certificate_requests on both client and server.
http://ankitasblogger.blogspot.com/2011/01/hadoop-cluster-setup.html
Hadoop cluster setup
http://www.mazsoft.com/blog/post/2009/11/19/setting-up-hadoophive-cluster-on-Centos5.aspx
Install Hadoop from the Cloudera tarball or rpm; I recommend the tarball.
Install Java through rpm.
Copy hadoop to /usr/local:
cp -r hadoop-0.20.2-cdh3u2... /usr/local
Export the Java path:
export JAVA_HOME=/usr/java/jdk
change in core-site.xml
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.1.127:8020</value>
</property>
change in hdfs-site.xml
/storage/name (sda), /storage1/name (sdb)
Should do a soft mount, not a hard mount.
rsync -r /storage/name /storage2/
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/storage/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/storage/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
mapred-site.xml
<property>
<name>mapred.job.tracker</name>
<value>192.168.1.127:8021</value>
</property>
Set up a passwordless connection between the namenode and slaves:
ssh-keygen -t rsa
ssh b@ip mkdir -p .ssh
cat .ssh/id_rsa.pub | ssh b@B 'cat >> .ssh/authorized_keys'
ssh b@ip
or
ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub
[email protected]
ssh centos1
ssh centos2
...
ssh b/w jobtracker and datanodes
namenode conf files
vi conf/masters
secondary namenode ip addr (e.g. centos1)
vi conf/slaves
datanodes ip addr
Change the ownership of the Hadoop tarball directory to hadoop:hadoop on all the servers (chown -R hadoop:hadoop).
ssh-keygen -t dsa
ssh-copy-id -i /home/hadoop/.ssh/id_dsa hadoop@localhost
for each server
bin/start-dfs.sh on the namenode
bin/start-mapred.sh on the jobtracker
[Cluster layout sketch: the master runs the Namenode, SN (secondary namenode), and JT (jobtracker); each slave runs a DN (datanode) and TT (tasktracker). The masters file holds the SN IP; the slaves file holds the 4 DN IPs.]
I started it but got errors like:
bash: /usr/local/hadoop/hadoop-0.20.2-cdh3u2/bin/hadoop-daemon.sh: No such file or directory
so I started one of the datanodes singly:
09/10/08 13:30:12 INFO ipc.Client: Retrying connect to server: /192.168.1.127:8021. Already tried 0 time(s).
I changed the namespace ID in /storage/name/current/VERSION to match the namenode's.
Then I ran the word count program and it worked.
Ganglia with puppet
/etc/puppet/manifests/site.pp
node hostname {
include ganglia
include ganglia::copy_conf
include ganglia::copy_services
}
#Ganglia configuration file
#Ganglia service
class ganglia{
package { 'rpm wget ftp://192.168.1.102/ganglia-gmond-3.0.71.el5.x86_64.rpm':
ensure => installed
}
}
worked with another
class ganglia{
exec{"gmond":
command => "/usr/bin/wget ftp://192.168.1.146/ganglia-gmond3.0.7-1.el5.x86_64.rpm",
cwd => "/root",
creates => "/root/ganglia-gmond-3.0.7-1.el5.x86_64.rpm",
}
}
http://www.unixmen.com/linux-tutorials/1591-install-puppet-master-andclient-in-ubuntu
class ganglia{
package { 'ganglia':
ensure => installed
}
service { 'ganglia':
ensure => true,
enable => true,
require => Package['ganglia']
}
package { 'yum':
ensure => installed,
}
}
http://tech.mangot.com/
class ganglia{
package { "ganglia":
ensure => installed
}
package { "ganglia-gmond":
ensure => installed
}
service { "gmond":
ensure => running,
subscribe => File["/etc/init.d/gmond"],
enable => true,
require => File["gmond"]
}
}
Another .pp
service { "pakiti":
enable => "true",
name => "pakiti",
start => "/etc/init.d/pakiti start",
status => "/etc/init.d/pakiti status",
stop => "/etc/init.d/pakiti stop",
ensure => "running",
hasstatus => "true",
require => Package["pakiti-client"],
}
worked with this code
class ganglia{
package { "ganglia":
ensure => installed
}
package { "ganglia-gmond":
ensure => installed
}
service { "gmond":
enable => "true",
#start => "/etc/init.d/gmond start",
ensure => "running",
#require => File['/etc/init.d/gmond']
}
}
Installation ganglia with puppet completed in centos
/etc/puppet/modules/ganglia/manifests/init.pp
class ganglia{
package { "ganglia":
ensure => installed
}
package { "ganglia-gmond":
ensure => installed
}
include ganglia::copy_conf
include ganglia::copy_services
}
service { "gmond":
enable => "true",
#start => "/etc/init.d/gmond start",
ensure => "running",
#require => File['/etc/init.d/gmond']
}
class ganglia::copy_services{
file { 'gmond':
path => '/etc/init.d/gmond',
content =>
template('/etc/puppet/modules/ganglia/templates/services/gmond.erb'),
ensure => file,
owner => "root",
group => "root",
mode => 777,
}
}
class ganglia::copy_conf{
file { 'gmond.conf':
path => '/etc/gmond.conf',
ensure => file,
content =>
template('/etc/puppet/modules/ganglia/templates/conf/gmond.conf.erb'),
owner => "root",
group => "root",
mode => 777,
}
}
Errors: when you get a certificate error, remove the requests and certs in
/var/lib/puppet/ssl/certs
/var/lib/puppet/ssl/certificate_requests
or else reinstall and rerun puppet.
Move the /etc/init.d/gmond to /…services/gmond.erb
Move /etc/conf/gmond.conf to /..conf/gmond.conf.erb
http://groups.google.com/group/puppetusers/browse_thread/thread/1b4f4edf1d328b4d?pli=1
Hadoop with puppet
http://itand.me/using-puppet-to-manage-users-passwords-and-ss
http://duxklr.blogspot.com/2011/05/using-puppet-to-manage-users-groupsand.html
define add_user($uid){
$username = $avinash
user {$avinash:
home => "/home/$avinash",
shell => "/bin/bash",
uid => 503,
ensure => present,
}
group{$avinash:
gid => 504,
require => User[$avinash],
ensure => present,
}
file{"/home/$avinash/":
ensure => directory,
owner => $avinash,
group => $avinash,
mode => 750,
require => [User[$avinash],Group[$avinash]],
}
}
Finished, but I can't create the user.
user { "avinash":
groups => 'avinash',
comment => 'This user was created by Puppet',
ensure => 'present',
managehome => 'true',
}
file { "/home/avinash/":
ensure => 'directory',
require => User['avinash'],
owner => 'avinash',
mode => '700',
}
http://itand.me/using-puppet-to-manage-users-passwords-and-ss
tried this code but it didn't work
class useradd {
$username = $avinash
user { $avinash:
home => "/home/$avinash",
shell => "/bin/bash",
uid => 503,
ensure => present,
}
group {$avinash:
gid => 504,
require => User[$avinash],
ensure => present,
}
file{"/home/$avinash/":
ensure => directory,
owner => $avinash,
group => $avinash,
mode => 750,
require => [User[$avinash],Group[$avinash]],
}
}
user { "avinash":
groups => 'avinash',
comment => 'This user was created by Puppet',
ensure => 'present',
managehome => 'true',
}
file { "/home/avinash/":
ensure => 'directory',
require => User['avinash'],
owner => 'avinash',
mode => '700',
}
Failed
# /etc/puppet/modules/users/virtual.pp
class useradd::virtual {
@user { "avinash":
home => "/home/avinash",
ensure => "present",
groups => ["root","avinash"],
uid => "504",
password => "centos",
comment => "User",
shell => "/bin/bash",
managehome => "true",
}
}
http://marksallee.wordpress.com/2010/08/25/create-a-puppet-test-networkwith-virtualbox/
puppet with hadoop
class hadoop{
exec{"hadoop-tar":
command => "/usr/bin/wget ftp://192.168.1.127/hadoop-0.20.2-cdh3u2.tar.gz",
cwd => "/home/hadoop",
creates => "/home/hadoop/hadoop-0.20.2-cdh3u2.tar.gz",
}
exec {"hadooptar":
command => "/bin/tar -xvvf hadoop-0.20.2-cdh3u2.tar.gz",
cwd => "/home/hadoop",
creates => "/home/hadoop/hadoop-0.20.2-cdh3u2/",
}
# a fuller example, including permissions and ownership
file { "/storage":
ensure => "directory",
owner => "hadoop",
group => "hadoop",
mode => 750,
}
}
future
http://bitfieldconsulting.com/puppet-and-mysql-create-databases-and-users
mysql
http://blog.gurski.org/index.php/2010/01/28/automatic-monitoring-withpuppet-and-nagios/
nagios
pig
tar
records = LOAD '/home/hadoop/sample.txt' AS (year:int,temp:int);
dump records;
dump filter_records;
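The filter behind filter_records isn't shown above; assuming a simple temperature condition, the same relation can be sketched in awk on tab-separated (year, temp) rows (the sample rows and the > 25 cutoff are made up for illustration):

```shell
# A Pig FILTER over (year:int, temp:int) records, sketched in awk
printf '1950\t22\n1950\t30\n1951\t28\n' | awk -F '\t' '$2 > 25 {print $1, $2}'
```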
Loadfunc
Reverse
Name
Avinash
Sharad
o/p
hsaniva
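The Reverse UDF's expected output can be sanity-checked on the command line with rev:

```shell
# String reversal, same behavior as the Reverse UDF above
echo "avinash" | rev
```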
how to read a xml file
<xml>
<ps>
</ps>
Xml to text
[email protected]
LAMP architecture
Kerberos: authentication
How Hadoop is better than traditional technology
Flume
Collects log files, aggregates, transforms.
35871: port of the Flume master
Sqoop
Install sqoop,hbase from tar
yum install mysql mysql-server php-mysql
chkconfig --levels 235 mysqld on
service mysqld start
Set the hadoop, hbase, and java paths
Copy the MySQL JDBC driver jar to sqoop/lib/
Create a user account for hadoop
Set a password for hadoop in mysql
Use database;
Create a table in mysql
Create table tname(id int,name char(20),primary key(id));
Insert into tname values()
GRANT ALL ON mysql.* TO 'hadoop'@'localhost';
sqoop import --connect jdbc:mysql://192.168.1.56/<mysql> --username root --password centos --table <sqooptable>
bin/sqoop import --connect jdbc:mysql://localhost/<mysql> --username hadoop --password centos --table foo
After running, it generates the transferred files.
See o/p in hdfs /user/hadoop/tname/part…
Create an empty table bar
Grant privileges
bin/sqoop export --connect jdbc:mysql://localhost/mysql --username
hadoop --password centos --table bar --export-dir yeluri.txt
mysql to hive
see in mysql whether the table is updated or not
that is sqoop
mysql to hive
sqoop import --connect jdbc:mysql://localhost/movielens --username root --password centos --table genre --hive-import --hive-table GENRE --hive-home /usr/lib/hive
sqoop --hive-import --connect jdbc:mysql://localhost/movielens --username
root --password centos --table genre --hive-table GENRE /usr/lib/hive/bin/
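To keep the host, database, and table swappable per environment, the import command can be assembled from variables; the values below are the sample ones from the notes, and the command is only echoed here, not run:

```shell
# Assemble the sqoop hive-import command from variables
host=localhost; db=movielens; table=genre; user=root
cmd="sqoop import --connect jdbc:mysql://$host/$db --username $user --table $table --hive-import --hive-table GENRE"
echo "$cmd"   # inspect before running on a real cluster
```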
PIG
Export the Pig path and home
Export the Java path and home
bin/pig -x local
REGISTER /home/hadoop/Pigex.jar
a = load '/home/hadoop/y.txt' as (number,age,year);
b = foreach a generate number,age,year(c);
drbd http://www.cloudera.com/blog/2009/07/hadoop-ha-configuration/
touch /home/hadoop/excludes
<property>
<name>dfs.hosts.exclude</name>
<value>/home/hadoop/excludes</value>
<final>true</final>
</property>
<property>
<name>mapred.hosts.exclude</name>
<value>/home/hadoop/excludes</value>
<final>true</final>
</property>
Secondary namenode as namenode
Deleted the name folder, changed the IP addrs in hdfs-site and core-site, and changed /storage/name to /storage/namesecondary.
I got "failed to initialize" errors, which I overcame with:
hadoop-daemon.sh start namenode -importCheckpoint
next
hadoop-daemon.sh start namenode
HIVE
export HIVE_INSTALL=/home/hadoop/hive-0.7.1-cdh3u2
export PATH=$PATH:$HIVE_INSTALL/bin
export HADOOP_HOME=/home/hadoop/hadoop-0.20.2-cdh3u2
sudo cp mysql-connector-java-5.1.15/mysql-connector-java-5.1.15-bin.jar /usr/lib/hive/lib/
Switch to hdfs or a user that has permissions on HDFS.
Error was caused by JAVA_HOME not being exported in hadoop-env.sh.
install ant
yum search ant
ant.x86_64
export ANT_LIB=/path/to/ant/lib
export ANT_LIB=/usr/share/ant/lib
<property>
<name>hive.hwi.listen.host</name>
<value>0.0.0.0</value>
<description>This is the host address the Hive Web Interface
will listen on</description>
</property>
<property>
<name>hive.hwi.listen.port</name>
<value>9999</value>
<description>This is the port the Hive Web Interface will
listen on</description>
</property>
<property>
<name>hive.hwi.war.file</name>
<value>lib/hive_hwi.war</value>
<description>This is the WAR file with the jsp content for Hive
Web Interface</description>
</property>
hive_hwi.war is in /usr/lib/hive/lib/ (hive_hwi.war.cdh3.war)
chmod 755 the …..war file
#start httpd
#stop iptables
In the worst case:
export ANT_LIB=/usr/share/ant/lib
bin/hive --service hwi
https://cwiki.apache.org/confluence/display/Hive/HiveWebInterface
http://localhost:9999/hwi
put some data into hdfs
hive.metastore.warehouse.dir
// for(int val =0;Iterable(values)!=0;values.iterator())
// {
// }
alias eth0 forcedeth
alias eth1 forcedeth
alias scsi_hostadapter sata_nv
alias scsi_hostadapter1 usb-storage
add user in hdfs
add avinash in linux
bin/hadoop fs -mkdir /user/avinash
bin/hadoop fs -chown hadoop:supergroup /user/avinash
core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.1.231:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/var/lib/hadoop-0.20/cache/${user.name}</value>
</property>
<!-- OOZIE proxy user setting -->
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>*</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<!-- specify this so that running 'hadoop namenode -format' formats the
right dir -->
<name>dfs.name.dir</name>
<value>/var/lib/hadoop-0.20/cache/hadoop/dfs/name</value>
</property>
<!-- Enable Hue Plugins -->
<property>
<name>dfs.namenode.plugins</name>
<value>org.apache.hadoop.thriftfs.NamenodePlugin</value>
<description>Comma-separated list of namenode plug-ins to be
activated.
</description>
</property>
<property>
<name>dfs.datanode.plugins</name>
<value>org.apache.hadoop.thriftfs.DatanodePlugin</value>
<description>Comma-separated list of datanode plug-ins to be activated.
</description>
</property>
<property>
<name>dfs.thrift.address</name>
<value>0.0.0.0:10090</value>
</property>
</configuration>
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:8021</value>
</property>
<!-- Enable Hue plugins -->
<property>
<name>mapred.jobtracker.plugins</name>
<value>org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin</value>
<description>Comma-separated list of jobtracker plug-ins to be activated.
</description>
</property>
<property>
<name>jobtracker.thrift.address</name>
<value>0.0.0.0:9290</value>
</property>
</configuration>
Kerberos cdh3
https://ccp.cloudera.com/display/CDHDOC/Configuring+Hadoop+Security+in
+CDH3#ConfiguringHadoopSecurityinCDH3-TryRunningaMap%2FReduceJob
cacti http://www.cacti.net/downloads/docs/html/unix_configure_cacti
http://www.cyberciti.biz/faq/fedora-rhel-install-cacti-monitoring-rrd-software/
hadoop jar jarname classname inputhdfspath outputhdfspath
#!/usr/bin/env python
'''
This script used by hadoop to determine network/rack topology. It
should be specified in hadoop-site.xml via topology.script.file.name
Property.
<property>
<name>topology.script.file.name</name>
<value>/home/hadoop/topology.py</value>
</property>
'''
import sys
from string import join
DEFAULT_RACK = '/default/rack0';
RACK_MAP = { '208.94.2.10' : '/datacenter1/rack0',
'1.2.3.4' : '/datacenter1/rack0',
'1.2.3.5' : '/datacenter1/rack0',
'1.2.3.6' : '/datacenter1/rack0',
'10.2.3.4' : '/datacenter2/rack0'
}
if len(sys.argv)==1:
print DEFAULT_RACK
else:
print join([RACK_MAP.get(i, DEFAULT_RACK) for i in sys.argv[1:]]," ")
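The same rack lookup can be written as a shell function, which is easy to test from the command line; the IPs and rack names are the ones from RACK_MAP above:

```shell
# Rack lookup matching the Python topology script above
rack_of() {
  case "$1" in
    208.94.2.10|1.2.3.4|1.2.3.5|1.2.3.6) echo /datacenter1/rack0 ;;
    10.2.3.4) echo /datacenter2/rack0 ;;
    *) echo /default/rack0 ;;   # DEFAULT_RACK
  esac
}
rack_of 1.2.3.5    # known host -> its rack
rack_of 9.9.9.9    # unknown host -> default rack
```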
SaaS
PaaS
IaaS
Namenode is in safemode
Nfs
Given doc
Server-backup
Client namenode
echo "hadoop install"
#wget http://archive.cloudera.com/redhat/cdh/cdh3-repository-1.0-1.noarch.rpm
#rpm -ivh cdh3-repository-1.0-1.noarch.rpm
#yum search hadoop
echo "hadoop installing"
#sudo yum -y install hadoop-0.20
#yum -y install hadoop-0.20-namenode
#yum -y install hadoop-0.20-secondarynamenode
#yum -y install hadoop-0.20-jobtracker
#yum -y install hadoop-0.20-datanode
#yum -y install hadoop-0.20-tasktracker
echo " install mysql"
#yum -y install mysql-server mysql httpd
#scp
[email protected]:/usr/lib/hadoop-0.20/conf/core-site.xml
/usr/lib/hadoop-0.20/conf/core-site.xml
#scp
[email protected]:/usr/lib/hadoop-0.20/conf/hdfs-site.xml
/usr/lib/hadoop-0.20/conf/hdfs-site.xml
#scp
[email protected]:/usr/lib/hadoop-0.20/conf/mapred-site.xml
/usr/lib/hadoop-0.20/conf/mapred-site.xml
#wget ftp://192.168.1.32/jdk-6u25-linux-x64-rpm.bin
#chmod 755 jdk-6u25-linux-x64-rpm.bin
#./jdk-6u25-linux-x64-rpm.bin
export JAVA_HOME=/usr/java/jdk1.6.0_25
export PATH=$PATH:$JAVA_HOME/bin
hadoop log retention
IIRC, "mapred.userlog.retain.hours" (24h default) controls this in my
environment and it seems to work fine on my cluster. Are you sure you
have tasklogs older than 24h lying around? It might even be a bug that
may have been fixed in the subsequent 0.20 releases that went out
recently.
Thanks for the reply. I realized that the property you mentioned
was missing in my mapred-site.xml.
I added the entry and it works just fine.
Was my assumption that "hadoop.tasklog.logsRetainHours" in
log4j.properties will do the same wrong? What is this property for in that
case?
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote
$HADOOP_NAMENODE_OPTS"
HADOOP_NAMENODE_OPTS="-Xmx500m" will set it to 500MB. The "OPTS" here
> refers to JVM options. -Xmx is a common JVM option to set the maximum
> heap.
Set dfs.replication=2;
Increase the heap size of tasktracker jvm
mapred.child.java.opts property.
The default setting is -Xmx200m, which gives each task 200 MB of memory.
Datanode summary
http://192.168.1.123:50075/blockScannerReport?listblocks
46122674
Solved the below problem by stopping the firewall and disabling SELinux.
cat temp.txt | awk -F "\t" '{$1=$1}1' OFS="\n"
print horizontal
cat temp.txt | awk -F "\t" '{$1=$1}1' OFS="\n" > avi.txt
paste ban.txt avi.txt
paste ban.txt avi.txt | awk -F " " '{print $1" -----"$3}'
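A worked run of the pipeline on throwaway files shows what each stage produces (the two-column sample contents are made up; note that with one field per avi.txt line, the joined value comes out in $2, not $3):

```shell
# Columns to rows and back, on scratch files in /tmp
printf 'a\tb\n' > /tmp/temp.txt
awk -F "\t" '{$1=$1}1' OFS="\n" /tmp/temp.txt > /tmp/avi.txt   # one field per line: a, b
printf '1\n2\n' > /tmp/ban.txt
paste /tmp/ban.txt /tmp/avi.txt | awk '{print $1" -----"$2}'
```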
ntp
server 0.us.pool.ntp.org
server 1.us.pool.ntp.org
server 2.us.pool.ntp.org
server 3.us.pool.ntp.org
service ntpd start
chkconfig ntpd on
iptables -I INPUT -p udp --dport 123 -j ACCEPT
iptables -L
ntpq -p