NaturallyStupidPotato

Thursday, July 7, 2016

Python -- Pandas -- Add a new level to the column in DataFrame

Here is the situation, we have a data frame df

X Y Z
0 0.738596 0.852906 0.333422
1 0.820456 0.014704 0.233935
2 0.118291 0.714536 0.176275
3 0.032417 0.819386 0.949590
4 0.739559 0.923865 0.791574

And for some reason, we want to add a new level to the columns. Unlike adding new level to index, it seems that there is no available function in Pandas to handle this problem.

We can of course do it "manually", something like this:

df.columns = pd.MultiIndex.from_arrays(np.vstack((np.asarray(['test'] * df.shape[1]), df.columns.tolist())))

A more general function is

def add_new_level(index, new_level):
arrays = np.asarray(index.get_values().tolist()).T
new_arrays = np.vstack((new_level, arrays))
return pd.MultiIndex.from_arrays(new_arrays)

(not sure if this is the simplest one, have some doubts)

and the data frame becomes:
test
X Y Z
0 0.738596 0.852906 0.333422
1 0.820456 0.014704 0.233935
2 0.118291 0.714536 0.176275
3 0.032417 0.819386 0.949590
4 0.739559 0.923865 0.791574

or we can play some trick like this one:

df = pd.concat([df], axis=1, keys=['test'])

Sunday, June 26, 2016

Programming Problems about Stocks

LeetCod 122: Best Time to Buy and Sell Stock II

Say you have an array for which the ith element is the price of a given stock on day i.

Design an algorithm to find the maximum profit. You may complete as many transactions as you like (ie, buy one and sell one share of the stock multiple times). However, you may not engage in multiple transactions at the same time (ie, you must sell the stock before you buy again).

class Solution {
public:
int maxProfit(vector<int>& prices) {
if (prices.empty()) return 0;

int max_output = 0;

int max_profit_normal = 0;
int max_profit_sell_at_i = prices.back();

for (int i = prices.size()-2; i >= 0; --i) {
max_profit_normal = max(max_profit_sell_at_i - prices[i], max_profit_normal);
max_profit_sell_at_i = max_profit_normal + prices[i];
}

return max_profit_normal;

}
};

LeetCode 123: Best Time to Buy and Sell Stock III

Say you have an array for which the ith element is the price of a given stock on day i.

Design an algorithm to find the maximum profit. You may complete at most two transactions.

Note:
You may not engage in multiple transactions at the same time (ie, you must sell the stock before you buy again).

class Solution {
public:
int maxProfit(vector<int>& prices) {

vector<int> min_prices(prices.size(), 0);

int min_val = numeric_limits<int>::max();

for (int i = 0; i < prices.size(); ++i) {
min_val = min(min_val, prices[i]);
min_prices[i] = min_val;
}

vector<int> max_prices(prices.size(), 0);

int max_val = numeric_limits<int>::min();

for (int j = prices.size()-1; j >=0; --j) {
max_val = max(max_val, prices[j]);
max_prices[j] = max_val;
}

vector<int> max_profit(prices.size(), 0);
int max_profit_so_far = 0;

for (int i = 0; i < prices.size(); ++i) {
max_profit_so_far = max(max_profit_so_far, prices[i] - min_prices[i]);
max_profit[i] = max_profit_so_far;
}

vector<int> max_profit_from_right(prices.size(), 0);
int max_profit_so_far_from_right = 0;

for (int i = prices.size()-1; i >= 0; --i) {
max_profit_so_far_from_right = max(
max_profit_so_far_from_right, max_prices[i] - prices[i]);
max_profit_from_right[i] = max_profit_so_far_from_right;
}

int output = 0;
for (int i = 0; i < prices.size(); ++i) {
output = max(output, max_profit[i] + max_profit_from_right[i]);
}

return output;

}
};

LeetCode 309: Best Time to Buy and Sell Stock with Cooldown

Say you have an array for which the ith element is the price of a given stock on day i.

Design an algorithm to find the maximum profit. You may complete as many transactions as you like (ie, buy one and sell one share of the stock multiple times) with the following restrictions:

You may not engage in multiple transactions at the same time (ie, you must sell the stock before you buy again).
After you sell your stock, you cannot buy stock on next day. (ie, cooldown 1 day)

class Solution {
public:
int maxProfit(vector<int>& prices) {
if (prices.empty()) return 0;

int n = prices.size();
vector<int> maxProfit(n, 0);

for (int i = n-2; i >= 0; --i) {

int maxProfit_if_buy_at_i = numeric_limits<int>::min();

for (int j = i+1; j < n; ++j) {
maxProfit_if_buy_at_i = max(maxProfit_if_buy_at_i,
prices[j] - prices[i] + (j + 2 < n ? maxProfit[j + 2] : 0));
}

maxProfit[i] = max(maxProfit[i+1], maxProfit_if_buy_at_i);
}

return maxProfit[0];

}
};

LeetCode 188: Best Time to Buy and Sell Stock IV

Say you have an array for which the ith element is the price of a given stock on day i.

Design an algorithm to find the maximum profit. You may complete at most k transactions.

Note:
You may not engage in multiple transactions at the same time (ie, you must sell the stock before you buy again).

class Solution {
public:
int maxProfit(int k, vector<int>& prices) {

if (prices.empty()) return 0;

int ndays = prices.size();

if (k >= prices.size()/2) {
int res = 0;
for (int i = 1; i < prices.size(); ++i)
res += max(prices[i] - prices[i-1], 0);
return res;
}

vector<int> B(ndays,0);
vector<int> S(ndays,0);

int max_val_B;
int max_val_S;

for (int kk = 0; kk < k; ++kk){
//cout << "===== " << kk << " =====" << endl;
max_val_B = numeric_limits<int>::min();
max_val_S = numeric_limits<int>::min();
for (int i = 0; i < ndays; ++i) {
max_val_B = max(max_val_B, S[i]-prices[i]);
max_val_S = max(max_val_S, max_val_B + prices[i]);
B[i] = max_val_B;
S[i] = max_val_S;
}

}

return S.back();

}
};

Tuesday, June 21, 2016

Late Binding Behavior in Python

One interesting mistake mentioned in https://www.toptal.com/python/top-10-mistakes-that-python-programmers-make is the late binding behavior in Python.

Here is the problem, suppose we want to have the following function:

def create_multiplier():
return [lambda x: x * i for i in range(5)]

for func in create_multipler():
print(func(2))

We expect the code to print 0,2,4,6,8 while the actual output is 8,8,8,8,8. As explained in the post
"This happens due to Python’s late binding behavior which says that the values of variables used in closures are looked up at the time the inner function is called. So in the above code, whenever any of the returned functions are called, the value of i is looked up in the surrounding scope at the time it is called(and by then, the loop has completed, so i has already been assigned its final value of 4)."

And the author propose a hack solution to the problem:

>>> def create_multipliers():
...     return [lambda x, i=i : i * x for i in range(5)]
...
>>> for multiplier in create_multipliers():
...     print multiplier(2)
...
0
2
4
6
8

There is another solution to handle this problem, we define the create_multipliers as:

>>> def create_multipliers():
...     return (lambda x: x * i for i in range(5))

Instead of returning a list we return a generator. In this case, when the function is called, the i in range(5) only moves one step ahead.

>>> def create_multipliers():
...    return (lambda x : x * i for i in range(5))
>>> for func in create_multipliers():
...    print(func(2))
0
2
4
6
8

Sunday, June 19, 2016

Implementation of Singleton in Python

This is a simple implementation of the singleton in Python

class Singleton(type):
def __new__(cls, name, bases, namespace):
namespace['_instance'] = None
return super(Singleton, cls).__new__(cls, name, bases, namespace)

class SingletonBase(metaclass=Singleton):

def __new__(cls, *args, **kwargs):
if cls._instance is None:
print("Create new instance for class {}".format(cls.__name__))
instanceof = super(SingletonBase, cls).__new__(cls)
cls._instance = instanceof
return instanceof
else:
print("The class {} has been instantiated".format(cls.__name__))
return cls._instance

class MyClass(SingletonBase):
def __init__(self,val):
self._val = val

if __name__ == '__main__':
x = MyClass(10)
y = MyClass(20)
print("pointer of x: {}".format(x))
print("pointer of y: {}".format(y))
assert x is y
print("x._val: {}".format(x._val))
print("y._val: {}".format(y._val))

Tuesday, June 14, 2016

Implementation of immutable class in Python 3.5

# In this script, we will implement an Immutable class.
# The implementation is developed with Python 3. In order to run the code with Python2.7, we need to change the
# __setattr__ function in ImmutableBase to _setattribute__ function .

class ImmutableMeta(type):
def __new__(cls, name, bases, namespace):
namespace['_instances'] = []
return super(ImmutableMeta, cls).__new__(cls, name, bases, namespace)

class ImmutableBase(metaclass=ImmutableMeta):

def __setattr__(self, name, val):
if self in self._instances:
raise Exception("The class is immutable.")
self.__dict__[name] = val

def __getattribute__(self, item):
if item == '__dict__' and self in self._instances:
raise Exception("Not allowed to access __dict__.")
return super(ImmutableBase, self).__getattribute__(item)

def __init__(self):
self._instances.append(self)

class MyImmutableClass(ImmutableBase):
def __init__(self, x):
self.x = x
super(MyImmutableClass, self).__init__()

if __name__ == '__main__':
instanceof = MyImmutableClass(10)
instanceof.x = 200

Friday, May 13, 2016

一道关于飞机加油的问题

问题：
每个飞机只有一个邮箱，飞机之间可以互相加油并且一箱油可供一架飞机绕地球飞半圈，问为使至少一架飞机绕地球一圈飞回起飞的地方，至少需要出动几架飞机？（所有飞机从同一机场起飞，而且必须安全返回机场，不允许中途降落，中间没有飞机场）

不知道这道题的正确解是多少，这里给出4架安排的方法。

为方便起见先给地球标上坐标。从起点开始自西向东分别标记为 0->1->2->3->4->5->0

plane A: from 0 to 1, oil = 1/2 - 1/6 = 2/6 when x = 1
plane B: from 0 to 1, oil=2/6 when x = 1

1/6 oil from B to A, now, plane A oil = 1/2, plane B oil = 1/6

plane B back to start point
plane A fly to x = 4

meanwhile

plane B from 0 to 5, oil = 2/6 at x = 5
plane C from 0 to 5, oil = 2/6 at x = 5

1/6 oil from B to C, now plane C has 3/6 oil and plane B has 1/6 oil

plane B back to start point
plane C from 5 to 4 with 2/6 oil at x = 4

at this point, plane C meets plane A

1/6 from C to A, now both A and C has 1/6 oil

meanwhile,
plane B and plane D starts from 0 and fly to x = 5
at x = 5, both B and D has 2/6 oil

also at x = 5, B and D meet A and C

B and D give 1/6 to A and C
and all of them can fly back to the start point

===================

I have no idea if we can do this with only three planes. It seems that the plane D is not fully used...

Longest Palindrome Substring O(n) Algorithm

Here is a very detailed explanation of the O(n) algorithm: https://www.felix021.com/blog/page.php?pageid=6

The key ideas are :

We look for the center of the palindrom
when scan the string we want to use the previous calculation
palindrome is symmetric and we want to take advantage of this symmetric structure

This algorithm is a combination of Dynamic Program and Pruning.

Here is the implementation:

int longestPalindromeSubstring(string inStr) {
    
    // insert "space" between charaters in the orignal string.
    // this is a very useful pre-processing; it unifies the odd-length and even-length palindrom substring
    // It is obvious that we should use a character that does not appear in the original string.
    string s;
    for (int i = 0; i < inStr.size(); ++i) {
        s = s + '#';
        s = s + inStr[i];
    }
    s = s + '#';
    
    
    
    vector<int> P(s.size(), 0);
    
    // Here is the definition of the op_center and op_length:
    // op_center + op_length is the right-most boundary of the known palindrome substring.
    
    int op_center = -1;
    int op_length = 0;
    int record = numeric_limits<int>::min();
    
    // now, we scan the string and update op_center and op_length as well as the maximum lenght
    // of the palindrome substring.
    
    for (int i = 0; i < s.size(); ++i) {
        int curr_length = 0;
        if (i <= op_center + op_length) {   // when current position is covered by a known palindrome substring
            int j = 2 * op_center - i;      // here we take advantage of the symmetric structure
            curr_length = P[j];
        }
        
        // perform the two direction search
        while (i + curr_length < s.size() && i - curr_length >= 0 &&
               s[i+curr_length] == s[i-curr_length]) {
            ++curr_length;
        }
        
        P[i] = --curr_length;
        
        // update the record
        record = max(record, curr_length);
        if (i + curr_length > op_center + op_length) {
            op_center = i;
            op_length = curr_length;
        }
        
    }
    
    return record;
}

Labels