I study single-agent learning problems under memory constraints. The first chapter studies time-consistency issues in a general class of stationary dynamic environments called Markov Decision Processes with Partial Observation. The agent is restricted to plans with a fixed memory size, that is, strategies that can be implemented by a finite automaton of fixed size. As this induces a game with absent-mindedness, the ex-ante optimal strategy may not be time-consistent. I find that any ex-ante optimal bounded memory strategy satisfies a weaker form of time consistency, multi-self consistency (à la Piccione & Rubinstein 1997). This means that the agent would not want to deviate from the ex-ante optimal strategy in the current period, assuming he will follow the original strategy from tomorrow on. In the second chapter, I analyze the effects of memory limitations on the endogenous learning behavior of an agent in a standard two-armed bandit problem. I find that under memory constraints, the inclination to choose the currently better alternative does not constrain learning: there is no exploitation/exploration trade-off. Optimally, the memory states reflect the magnitude of the relative ranking of alternatives. After a high payoff from one of the alternatives, the agent optimally moves to a memory state with more pessimistic beliefs about the other, even though no information about the latter alternative is received. For the case where one alternative is substantially more informative than the other, he chooses the less informative alternative only for myopic exploitation purposes and ignores any information about it, suggesting specialization in learning. For the special case with one known (safe) alternative, a sufficiently patient agent never ceases experimentation and tries the unknown alternative at least occasionally after any history; this is counter to what theory predicts with unbounded memory, but in agreement with experimental findings.
Furthermore, he chooses the safe alternative with more optimistic beliefs than the optimal (unbounded memory) cutoff belief, again in conformity with experimental evidence.