Add the following two lines to ~/.bash_profile:
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
In a terminal, run the following command (or restart the user session):
source ~/.bash_profile
Prerequisite: a working Apache server.
In httpd-vhosts.conf:
Add the following content inside <VirtualHost …>
ScriptAlias /cgi-bin/ "<httpd-installed-path>/cgi-bin/"
<Directory "<httpd-installed-path>/cgi-bin/">
    Options Indexes FollowSymLinks ExecCGI
    AddHandler cgi-script .cgi .py
    AllowOverride None
    Require all granted
</Directory>
Now create a new file ‘test.py’ in that cgi-bin directory with the following content:
#!/usr/bin/env python
import cgi
cgi.test()
Make sure the following line is not commented out in httpd.conf:
LoadModule cgi_module modules/mod_cgi.so
Start or restart the Apache server.
You can verify that the CGI module is loaded with the command:
sudo [path/]apachectl -M | grep cgi
Open the following URL in a web browser:
http://<ip/hostname>:<port>/cgi-bin/test.py
e.g.:
http://localhost:1025/cgi-bin/test.py
http://localhost:80/cgi-bin/test.py
You will get a page with details such as the current working directory, command-line arguments, etc.
ERROR:
Modules/constants.c: In function ‘LDAPinit_constants’:
Modules/constants.c:158: error: ‘LDAP_OPT_DIAGNOSTIC_MESSAGE’ undeclared (first use in this function)
Modules/constants.c:158: error: (Each undeclared identifier is reported only once
Modules/constants.c:158: error: for each function it appears in.)
Modules/constants.c:380: error: ‘LDAP_CONTROL_RELAX’ undeclared (first use in this function)
error: command ‘gcc’ failed with exit status 1
Solution:
Install openldap24-libs and openldap24-libs-devel:
sudo yum install openldap24-libs-devel
sudo yum install openldap24-libs
Run the below commands and get the unique list of directories from the output:
Add those folders to setup.cfg file in the below section:
[_ldap]
library_dirs = /opt/openldap-RE24/lib /usr/lib
include_dirs = /opt/openldap-RE24/include /usr/include/sasl /usr/include
Now run the installation command:
python setup.py install
NOTE: The code below is Python 3.
base = 26

def transform(row):
    res = []
    for field in [tmp.strip() for tmp in row.split(',')]:
        ival = 0
        power = 0
        # Walk the letters right to left, reading the field as a base-26 number
        for c in field[::-1]:
            ival += pow(base, power) * (ord(c) - ord('A') + 1)
            power += 1
        res.append(ival)
    return res

print(transform("A, B, Z, AA, AB, AAA"))
Output:
[1, 2, 26, 27, 28, 703]
T: number of test cases
M: value of M
N: number of strings
For each string, the ASCII value of each character is raised to the power M and the results are multiplied together. For each test case, print whether the sum of these per-string products is ODD or EVEN.
Input:
1
10 2
ac ab
Output:
ODD
TIP: The challenge lies in finding the answer without performing all the mathematical operations mentioned in the question (yes, also called optimizing!).
Solution:
#!/usr/bin/python
# Python 2: uses raw_input (use input() on Python 3)
T = int(raw_input())
while T > 0:
    T = T - 1
    tmp = raw_input()
    M = int(tmp.split()[0])
    K = int(tmp.split()[1])
    tmp = raw_input()
    final_odd = False
    for s in tmp.split():
        odd = None
        for i in s:
            if ord(i) % 2 == 0:
                # an even factor makes the whole product even,
                # whatever the power M is
                odd = False
            else:
                # odd * odd = odd, so the product stays odd
                # only while no even factor has been seen
                if odd == None:
                    odd = True
        # odd + odd = even
        # odd + even = odd
        # even + even = even
        if final_odd != odd:
            final_odd = True
        else:
            final_odd = False
    if final_odd:
        print("ODD")
    else:
        print("EVEN")
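The same parity reasoning can be sketched as a small Python 3 function, which is easier to test than a script reading from stdin. The name parity_of_sum is hypothetical, and the sketch assumes M is at least 1 so that raising a value to the power M never changes its parity:

```python
def parity_of_sum(m, strings):
    """Return "ODD" or "EVEN" for the sum of the per-string products.

    Assumes m >= 1: a positive power keeps the parity of its base,
    so only the parity of each ASCII value matters.
    """
    odd_products = 0
    for s in strings:
        # A product is odd only when every one of its factors is odd.
        if all(ord(ch) % 2 == 1 for ch in s):
            odd_products += 1
    # A sum is odd exactly when it has an odd number of odd terms.
    return "ODD" if odd_products % 2 == 1 else "EVEN"


print(parity_of_sum(10, ["ac", "ab"]))  # ODD
```

No power or product is ever computed, which is the optimization the tip hints at.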
When you use re.findall with the pattern "(.*):" to get the strings ending with a colon (:), you may not always get the expected result if your string has multiple colons. Let's run the command before we talk.
Concentrate on the commands inside the single quotes. (You can ignore the rest; it is only there so that I don't have to create a Python file to run this code.)
What happened here is that Python did a greedy search and found the longest possible string ending with a colon. If that is what you wanted, stop reading here. 🙂
If you expected a list of all the strings ending with a colon, change the pattern to "(.*?):".
As you can see, I added a question mark, which makes the preceding quantifier non-greedy, so it matches as little as possible.
Before we part, let's make sure Python is OK with this change.
Output:
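Both patterns can be compared side by side in a short Python 3 script. The sample string below is an assumption chosen for illustration, not the one from the original run:

```python
import re

text = "name:age:city:"  # hypothetical sample input

# Greedy: .* grabs as much as it can, then backs off to the last colon
greedy = re.findall(r"(.*):", text)

# Non-greedy: .*? stops at the first colon it can reach
lazy = re.findall(r"(.*?):", text)

print(greedy)  # ['name:age:city']
print(lazy)    # ['name', 'age', 'city']
```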
Question:
Find the biggest black square in an N*N matrix of squares. 1 means it’s black, 0 means it’s white. The output should be the list of squares that form the largest black square.
Input : ‘{0#1#1#1#0#1#0#1,1#0#1#0#0#0#0#1,0#0#0#1#0#1#0#0,1#1#1#1#1#0#0#1,1#1#1#1#0#1#1#1,1#1#1#1#0#1#1#1,1#1#1#1#1#1#1#1,1#1#0#1#0#0#1#1}’
Output : {(3#0,3#1,3#2,3#3),(4#0,4#1,4#2,4#3),(5#0,5#1,5#2,5#3),(6#0,6#1,6#2,6#3)}
Solution(Python):
#!/usr/bin/python
# Author : Sreejith Sreekantan
# Description :
# Find the biggest black square from an N*N matrix of squares. 1 means it's black, 0 means it's white
# Input : '{0#1#1#1#0#1#0#1,1#0#1#0#0#0#0#1,0#0#0#1#0#1#0#0,1#1#1#1#1#0#0#1,1#1#1#1#0#1#1#1,1#1#1#1#0#1#1#1,1#1#1#1#1#1#1#1,1#1#0#1#0#0#1#1}'
# Output : {(3#0,3#1,3#2,3#3),(4#0,4#1,4#2,4#3),(5#0,5#1,5#2,5#3),(6#0,6#1,6#2,6#3)}

def largestSquareAt(sq, i, j, k=1):
    # Side of the largest all-black square with top-left corner (i, j).
    # Each call checks the column and the row added by growing to side k.
    # "<=" lets the square touch the last row and column.
    if not (i+k <= len(sq) and j+k <= len(sq[i])):
        return 0
    for x in range(0, k):
        if sq[i+x][j+k-1] == '0':
            return 0
    for y in range(0, k):
        if sq[i+k-1][j+y] == '0':
            return 0
    return 1 + largestSquareAt(sq, i, j, k+1)

def biggestSquare(input1):
    if len(input1) == 0 or not (input1.count('{') == input1.count('}')):
        return ''
    i = str(input1)
    # Strip the braces, then split into rows (',') and cells ('#')
    j = i.strip('{}').split(',')
    j = [x.split('#') for x in j]
    resx = -1
    resy = -1
    ressize = -1
    res = []
    for x in range(0, len(j)):
        res.append([])
        for y in range(0, len(j[x])):
            res[x].append(largestSquareAt(j, x, y))
            if ressize < res[x][y]:
                resx = x
                resy = y
                ressize = res[x][y]
    resstring = ""
    if ressize > 0:
        resstring = "{"
        for x in range(resx, resx+ressize):
            if x > resx:
                resstring += ","
            resstring += "("
            for y in range(resy, resy+ressize):
                if y > resy:
                    resstring += ","
                resstring += (str(x)+"#"+str(y))
            resstring += ")"
        resstring += "}"
    return resstring

input = '{0#1#1#0,0#1#1#1,0#1#1#1,1#1#1#1}'
input1 = '{0#1#1#1#0#1#0#1,1#0#1#0#0#0#0#1,0#0#0#1#0#1#0#0,1#1#1#1#1#0#0#1,1#1#1#1#0#1#1#1,1#1#1#1#0#1#1#1,1#1#1#1#1#1#1#1,1#1#0#1#0#0#1#1}'
input2 = '{0#1#1#1#0#1#0#1,1#0#1#0#0#0#0#1,0#0#0#1#0#1#0#0,1#1#1#1#1#0#0#1,1#1#0#1#0#1#1#1,1#1#1#1#0#1#1#1,1#1#1#1#1#1#1#1,1#1#0#1#0#0#1#1}'
input3 = '{0#1#1#1#0#1#0#1,1#0#1#0#0#0#0#1,0#0#0#1#0#1#0#0,1#1#1#1#1#0#0#1,0#1#0#1#0#1#1#1,1#1#1#1#0#1#1#1,1#1#1#1#1#1#1#1,1#1#0#1#0#0#1#1}'
input4 = '{0#0#0#0,0#0#0#0,0#0#0#0,0#0#0#0}'
print(biggestSquare(''))
print(biggestSquare(input1))
print(biggestSquare(input2))
print(biggestSquare(input3))
print(biggestSquare(input4))
Question:
Given a string of length N, output “Correct” if the brackets close in the correct order; otherwise output “Incorrect”.
Input : ({}[((({{}})[{()}]))])
Solution(Python):
#!/usr/bin/python
# Author : Sreejith Sreekantan
# Description :
# Given a string of length N, output "Correct" if brackets close in correct order else output "Incorrect"
# Input : ({}[((({{}})[{()}]))])

def validString(input1):
    stk = []
    flag = True
    for x in input1:
        if x in ['{', '(', '[']:
            stk.append(x)
        else:
            if len(stk) == 0:
                flag = False
            elif x == '}' and not stk[len(stk)-1] == '{':
                flag = False
            elif x == ')' and not stk[len(stk)-1] == '(':
                flag = False
            elif x == ']' and not stk[len(stk)-1] == '[':
                flag = False
            if not flag:
                return "Incorrect"
            else:
                stk.pop()
    # Any bracket still on the stack was never closed
    if len(stk) > 0:
        return "Incorrect"
    return "Correct"

input1 = '({}[((({{}})[{()}]))])'
input2 = '({}((({{}})[{()}]))])'
print(validString(input1))
print(validString(input2))
This series of articles, “Pythonize”, will serve as an aid for Python beginners. In this chapter I will try to explain all about Python classes. In this series I will let Python code talk to you more than I do.
A few interesting things about Python for you: Internet giants such as Google and Yahoo maintain large code bases in Python and promote Python to a large extent. Google apps can now be developed in Python using the Python APIs provided by Google. Python is known for delivering more functionality in less time, being a scripting language, while at the same time enabling OOP (Object-Oriented Programming). It is rich in libraries and online help. You can access
Python Class:
A class is a template or prototype that defines the fields and methods common to all objects created from it. A class definition in Python starts with the keyword ‘class’, followed by the class name and then an optional comma-separated list of base classes inside parentheses. Anything not found in your class is looked up in those base classes depth-first, left to right (Python 3 and new-style classes refine this order with the C3 method resolution order). The header ends with a colon, and the indented block of statements right after it forms the body of the class. Note: Python keywords are case-sensitive. Stick with lowercase letters. The code below won’t compile.
E.g.
FYI: pass is the keyword that stands for an empty block; it is used here to avoid an error.
Variables:
The data type of a Python variable is set by the value assigned to it. Some valid variable assignments:
E.g.
I hope you noted that I reassigned ‘variable1’ from a number to a string without any extra code and the program worked fine. That is the extent of freedom Python gives you. There is no keyword to define the scope of variables in a class, and all the usual variables declared inside a class are publicly accessible. Let's try this with an example.
E.g.
Output:
Python does offer a way to declare private variables. You just need to precede the variable name with two underscores (__) and, like magic, it becomes private. Such private attributes are typically used for data hiding. Let's experiment with a few lines of code.
E.g.
You may suspect that this happened because I tried to print the variable ‘name’ while the class variable is “__name”. Let's clear that doubt.
E.g.
Here we tried to print the variable under the same name. Since preceding a class variable with two underscores makes it private, we got an ‘AttributeError’. Python protects such private members internally by altering the name to contain the class name (name mangling), and we can still access such members by following the template:
<objectName>._<className>__<attributeName>
Here our object name is prs, the class name is Person and the attribute is name (declared as __name), so as per our assumption we should get the variable's value by printing ‘prs._Person__name’
E.g.
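A minimal Python 3 sketch of the whole experiment. The names Person, prs and __name come from the text above; the value "Sreejith" is borrowed from the constructor example further down, and the exact shape of the original listing is an assumption:

```python
class Person:
    def __init__(self):
        # Two leading underscores trigger name mangling
        self.__name = "Sreejith"


prs = Person()

try:
    print(prs.__name)  # the plain name is not reachable from outside
except AttributeError as err:
    print("AttributeError:", err)

# The mangled name follows <objectName>._<className>__<attributeName>
print(prs._Person__name)
```

The attribute is hidden only by renaming, so this is a convention for data hiding rather than strict access control.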
Output:
Note: Use “self.” with variable names to refer to instance variables, and the class name followed by a period (.) and the variable name to refer to class variables. Otherwise Python will treat the name as a local or global name and raise an error, or give unexpected results if a global variable with the same name exists. Let's have one example of class variables and then go to the next topic:
E.g.
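A minimal Python 3 sketch of a class variable, assuming a hypothetical Employee class with placeholder company names. The names emp1 and emp2 follow the discussion below, and the change is made through the class name:

```python
class Employee:
    company = "Acme"  # class variable (hypothetical value)

    def __init__(self, name):
        self.name = name  # instance variable


emp1 = Employee("emp-one")
emp2 = Employee("emp-two")

# Rebinding through the class name is visible from every instance
Employee.company = "Globex"
print(emp1.company, emp2.company)

# Assigning through an instance creates a separate instance attribute
# that shadows the class variable for that object only
emp2.company = "Initech"
print(emp1.company, emp2.company, Employee.company)
```

The second half shows the common pitfall: an assignment through an instance does not touch the shared class variable.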
Output:
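Here you can see that changing the company name of emp2 is reflected in emp1 as well, i.e. emp1 and emp2 point to a single memory location. Such variables are called class variables: one single copy shared by all the objects of the class. Note that this holds when the variable is changed through the class; a plain assignment such as emp2.company = ... creates a separate instance attribute instead.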
Class functions:
The functions of a class decide its behavior. They act as an interface to the outside world. Take a look at a sample class with a function ‘print_emp_details’.
E.g.
Constructor:
Object-oriented programmers might already know the role of a constructor in a class. For those who don’t, let me explain. A constructor is the function that gets invoked automatically at the time of object creation, and it is usually used for resource allocation. Python differs a bit from other OOP languages like Java or C++ in the constructor's name and its invocation. In Python, the constructor is the method named __init__.
E.g.
You might be wondering about ‘self’ in the __init__ function. In Python, you will see this variable as the first parameter of all the instance functions of a class; without it the call will fail. ‘self’ is not a keyword; it refers to the instance on which the function is called. You can use any other name for this parameter, but ‘self’ is part of the coding standard, which makes your code readable and easy to understand for other programmers. Let's stick to that standard.
E.g.
A blank parameter list will raise a TypeError:
We have seen only constructors that don’t take any parameters so far. Now let's take a peek at parameterized constructors. As you guessed, the parameters of the constructor follow ‘self’.
E.g.
Here name and hobby are the parameters, and during object creation we passed the strings “Sreejith” and “programming” as name and hobby to the constructor. One thing to note is that with this class you won’t be able to create an object without any arguments. You will get the following error:
E.g.
Output:
Let's see a workaround for that. Here we assign default values to the constructor parameters, so the constructor uses the default value whenever an argument is not explicitly specified at the time of object creation.
E.g.
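A minimal Python 3 sketch of a constructor with default values. The name and hobby parameters follow the text; the Person class name, the describe() helper and the default strings are assumptions made for illustration:

```python
class Person:
    def __init__(self, name="Unknown", hobby="nothing in particular"):
        self.name = name
        self.hobby = hobby

    def describe(self):
        return "%s likes %s" % (self.name, self.hobby)


print(Person("Sreejith", "programming").describe())
print(Person().describe())                 # both defaults are used
print(Person(hobby="reading").describe())  # only name falls back
```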
Destructor:
E.g.
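Python's counterpart to a destructor is the __del__ method. A minimal Python 3 sketch, assuming a hypothetical Resource class and relying on CPython's reference counting, which runs __del__ as soon as the last reference goes away (other interpreters may delay it):

```python
events = []


class Resource:
    def __init__(self, name):
        self.name = name
        events.append("acquired " + name)

    def __del__(self):
        # Called when the object is about to be destroyed
        events.append("released " + self.name)


res = Resource("file-handle")
del res  # drops the last reference to the object
print(events)
```

Because the timing of __del__ depends on the interpreter, explicit cleanup methods are usually a safer place for releasing resources.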
Other base methods you can override in your class:
__repr__(self)
__str__(self)
__cmp__(self, obj1) (Python 2 only; Python 3 uses __eq__, __lt__ and related methods instead)
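A minimal Python 3 sketch of these methods, assuming a hypothetical Person class. Since __cmp__ does not exist in Python 3, the sketch uses __eq__ for the comparison part:

```python
class Person:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # Unambiguous form, used by repr() and the interactive prompt
        return "Person(%r)" % self.name

    def __str__(self):
        # Readable form, used by print() and str()
        return "Person named " + self.name

    def __eq__(self, other):
        # Equality check; stands in for the equality part of __cmp__
        return isinstance(other, Person) and self.name == other.name


p = Person("Sreejith")
print(str(p))
print(repr(p))
print(p == Person("Sreejith"))
```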
Inheritance:
Let's get straight into Python code that uses inheritance to reuse code and at the same time overrides the functions that need to change.
E.g.
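A minimal Python 3 sketch of such a hierarchy. The names Child, GrandChild, print_name() and smile() come from the discussion below; the Parent class and the returned strings are assumptions:

```python
class Parent:
    def print_name(self):
        return "I am the parent"

    def smile(self):
        return ":-)"


class Child(Parent):
    def print_name(self):  # overrides Parent.print_name
        return "I am the child"


class GrandChild(Child):
    def print_name(self):  # overrides Child.print_name
        return "I am the grandchild"


for obj in (Parent(), Child(), GrandChild()):
    # smile() is inherited unchanged; print_name() is taken from
    # the most derived class that defines it
    print(obj.print_name(), obj.smile())
```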
Output:
Here we override the print_name() function in the Child and GrandChild classes but use the same smile() function as it is. Functions are searched in the following hierarchy:
There are two functions which come in handy with class inheritance:
issubclass()
isinstance()
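A minimal Python 3 sketch of both functions, assuming two hypothetical classes Parent and Child:

```python
class Parent:
    pass


class Child(Parent):
    pass


child = Child()

print(issubclass(Child, Parent))    # True
print(issubclass(Parent, Child))    # False
print(isinstance(child, Parent))    # True
print(isinstance(Parent(), Child))  # False
```

issubclass() compares two classes, while isinstance() checks an object against a class and also accepts instances of its subclasses.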
Thus we have come to the end of this chapter. Let's recall what we have learned from this chapter. We learned
Meet you next time.
Apache Hadoop gives you the option to program your mapper and reducer in
your favourite language. If you wonder how that is possible, you will
see for yourself by going through this blog. Since Python got into my
favourite-language list recently, let me try it with Python. Python
already has a module, pydoop, which provides an API for programming
MapReduce. But this time we will program without pydoop, so that you
get an idea of how to achieve the same in your preferred programming
language. Apache Hadoop comes with a streaming jar which takes as
parameters your mapper program, your reducer program, the input file and
the output location. It then streams the data in your input file to the
stdin (if I am going a bit too technical here, refer to standard streams)
of the mapper program. Your mapper program is supposed to read from stdin,
process the data and write key-value pairs to stdout; the separator
between key and value is up to you, since you are the one who reads
those key-value pairs back (by default, Hadoop streaming treats
everything up to the first tab as the key). The hadoop-streaming jar
then collects the stdout of all the mapper executions (FYI: the mapper
program is executed on the data nodes where the chunks of the input
file are stored), sorts that data by key and
writes it to the stdin of the reducer program. Your reducer program should
read one key-value pair per line from stdin
and do the necessary processing to print the final processed data.
Now I will give you a feel of how things work out with a
character-count program, implemented in Python, which gives you
the count of every alphabetic character in the input data.
Now let me show you my mapper and reducer code:
charcount_mapper.py:
In charcount_mapper.py, I read stdin line by line, go through
each line character by character and check whether it is an alphabetic
character. If it is, I print it to stdout in the format
"<character><tab-space>1". This tells the reducer program that the
character appeared once. Here the key is <character> and the value is
'1' (one). The same "<character><tab-space>1" line can occur multiple
times, depending on the input data.
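The mapper described above can be sketched in Python 3 as follows. This is a reconstruction from the description, not the original listing; the function name run_mapper and the in-memory demonstration are assumptions added so the logic can be tried without Hadoop:

```python
import io
import sys


def run_mapper(stream=sys.stdin, out=sys.stdout):
    """Write one "<character>\t1" line per alphabetic character."""
    for line in stream:
        for ch in line:
            if ch.isalpha():
                out.write("%s\t1\n" % ch)


# Local demonstration with in-memory streams instead of real stdin/stdout
demo_out = io.StringIO()
run_mapper(io.StringIO("ab a\n"), demo_out)
print(demo_out.getvalue(), end="")
```

Used as an actual streaming mapper, the script would simply call run_mapper() with no arguments so that it reads stdin and writes to stdout.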
charcount_reducer.py:
Now let's analyze charcount_reducer.py. Here I read stdin line by
line. Since it is what I wrote to stdout from the mapper program, I
can foretell that every line will be of the form
"<character><tab-space>1". The only difference between what I wrote
to stdout in the mapper program and what I get from stdin in the reducer
program is that the input is sorted by key by the time the reducer
reads it. This helps me construct the logic of the
reducer program: I just need to see whether a new key is encountered.
Until then I keep incrementing the counter. Once a new key is found,
the old key is printed along with the counter value and the counter is
reset. Inside the for loop, a key and its counter value are printed
only when a new key is encountered. Therefore I add one more print
statement at the end of the program to print the last key and its count
(in case you are confused about the print statement outside
the for loop).
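The reducer logic described above can be sketched in Python 3 as follows. As with the mapper, this is a reconstruction from the description rather than the original listing; run_reducer and the sample records are assumptions, and sorted() stands in for the sort step that Hadoop performs between map and reduce:

```python
import io
import sys


def run_reducer(stream=sys.stdin, out=sys.stdout):
    """Sum the counts of consecutive lines that share the same key."""
    current_key = None
    count = 0
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            continue
        key, value = line.split("\t", 1)
        if key != current_key:
            # A new key starts: emit the finished one and reset the counter
            if current_key is not None:
                out.write("%s\t%d\n" % (current_key, count))
            current_key = key
            count = 0
        count += int(value)
    # The last key is never followed by a new key, so emit it here
    if current_key is not None:
        out.write("%s\t%d\n" % (current_key, count))


# Local demonstration: unsorted mapper output, sorted before reducing
mapper_output = ["b\t1\n", "a\t1\n", "a\t1\n"]
demo_out = io.StringIO()
run_reducer(sorted(mapper_output), demo_out)
print(demo_out.getvalue(), end="")
```

Without the sort, equal keys would not be adjacent and the same character would be reported more than once.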
Since it is hard to debug programs in Hadoop, I will first verify the
functionality of my program locally. I will use a sample input:
You can easily predict the output our program should give. Let's
see if we get the same from the program.
Keep an eye on the command used in each screenshot.
I will use the 'cat' command to print the contents of the input file.
It is then piped to the mapper program.
As I said, this is the output of the mapper program; to feed it into the
reducer program we will, for now, have to sort it explicitly.
Now it's ready to be fed to the reducer program:
The output is as expected, isn't it?
Now let's run the same program on a Hadoop setup to see it succeed.
For that, start Hadoop by running the start-all.sh script. (Refer to
part 1 of the Big Data series in case of any confusion.)
Then we need to copy our sample input file into HDFS. Know
the command to do it? Let me help you.
Before that I will create a directory for our use:
Now we have a directory "charcount" in the path /user/thinker/ in the
Hadoop file system. Let's copy our input file from my local file system
to the Hadoop file system.
Let's verify the file's existence and its content:
Now we are sure about our input. Let's go ahead with the execution.
For that, the command is:
To ensure the availability of the hadoop-streaming jar, run the command:
This is the jar which does the job of reading the contents of the input
file and feeding them to the mapper program and ... (the rest you already know).
"-file" names the files which contain your programs. The programs need
not be copied to HDFS. I used "-file" twice to mention my mapper and
reducer files. "-mapper" names the mapper program's file (only the
file name and not the entire path). "-reducer" names the reducer program's
file. "-input" gives the absolute HDFS path of the input
file and "-output" gives the output directory to which the output will
be written.
Let's see the output of a successful execution of the above command:
Listing the contents of /user/thinker/charcount/:
We can see a new directory with the name sample_output.
Note: The command will fail to execute if you give an already existing
file/folder name with the '-output' option.
Let's list sample_output:
Our output will be in 'part-*' files. Since this particular job is very
small, we have only one file. The number of part files matches the
number of reduce tasks, so larger jobs typically produce more of them.
Let's print the generated output for final verification.
Even though the output differs in order, the counts are correct. You
can run the program with some other input by changing the file given
with the '-input' option. You can use any language to
program your mapper and reducer. Points to be noted are:
For scripting languages, the JVM/interpreter must be installed on all the data nodes.
For compiled languages like C or C++, you have to compile the program first, and
the executable file needs to be mentioned with '-file', '-mapper' and '-reducer'.
With that, I think I have covered almost everything you need to kick-start your MapReduce
programming in your preferred language. See you in the next part.
Big Data is a Big Deal..