In this talk, we will present new Jacobi-like relative value iteration (RVI) algorithms for the ergodic risk-sensitive control problem for discrete-time Markov chains, together with the associated Q-learning algorithms. For finite state spaces, we prove that the iterates of the new RVI algorithms converge geometrically; for countable state spaces, we prove convergence of an appropriately truncated problem. We employ the entropy variational formula to tackle the multiplicative nature of the risk-sensitive Bellman operator, albeit at the cost of an additional optimization problem over a corresponding set of probability vectors. We then discuss the entropy-based risk-sensitive Q-learning algorithms corresponding to both the existing and the new Jacobi-like RVI algorithms. These Q-learning algorithms have two coupled components: the usual Q-function iterates and the new probability iterates arising from the entropy variational formula. We prove convergence of the coupled iterates by analyzing the associated multi-scale stochastic approximations.
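
To indicate how the entropy variational formula enters, here is a minimal sketch in our own notation (finite state space S, actions A(x), running cost c, transition kernel p(. | x, a), risk parameter theta > 0); the talk's conventions may differ. The logarithmic form of the risk-sensitive Bellman operator,

\[
  (TV)(x) \;=\; \min_{a \in A(x)} \Big[\, c(x,a) \;+\; \tfrac{1}{\theta} \log \sum_{y \in S} p(y \mid x,a)\, e^{\theta V(y)} \Big],
\]

can be linearized via the entropy (Donsker--Varadhan) variational formula

\[
  \tfrac{1}{\theta}\log \sum_{y \in S} p(y \mid x,a)\, e^{\theta V(y)}
  \;=\; \max_{q \in \mathcal{P}(S)} \Big[\, \sum_{y \in S} q(y)\, V(y) \;-\; \tfrac{1}{\theta}\, D\big(q \,\Vert\, p(\cdot \mid x,a)\big) \Big],
\]

where D(q || p) = \sum_y q(y) \log(q(y)/p(y)) is the relative entropy and the maximum is attained at the tilted distribution q*(y) proportional to p(y | x,a) e^{\theta V(y)}. This replaces the multiplicative (log-sum-exp) nonlinearity with a linear term plus an inner maximization over probability vectors q, which is the additional optimization problem referred to above.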
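
The coupled structure of the Q-learning iterates can likewise be illustrated with a small two-timescale sketch. The toy model, step sizes, and update rules below are our own illustrative choices (a synchronous scheme with a known kernel), not the Jacobi-like algorithms of the talk: the probability iterates q(. | x, a) are driven on the faster timescale toward the tilted maximizer of the variational problem, while the Q-function iterates are updated on the slower timescale with an RVI-style normalization at a reference state-action pair.

import numpy as np

# Illustrative sketch only (not the algorithm from the talk): two-timescale
# iteration coupling Q-function iterates with per-(state, action) probability
# vectors q arising from the entropy variational formula.
rng = np.random.default_rng(0)
nS, nA, theta = 4, 2, 0.5                      # states, actions, risk parameter
P = rng.dirichlet(np.ones(nS), size=(nS, nA))  # transition kernel p(y | x, a)
C = rng.uniform(0.0, 1.0, size=(nS, nA))       # running cost c(x, a)

Q = np.zeros((nS, nA))
q = np.full((nS, nA, nS), 1.0 / nS)            # probability iterates q(y | x, a)
ref = (0, 0)                                   # reference pair for the relative (RVI) normalization

for n in range(1, 50001):
    a_n = 1.0 / (1 + n)          # slower step size: Q-function iterates
    b_n = 1.0 / (1 + n) ** 0.6   # faster step size: probability iterates
    V = Q.min(axis=1)            # greedy value estimate from the current Q

    for x in range(nS):
        for a in range(nA):
            # Fast timescale: track the maximizer of the variational problem,
            # the tilted distribution proportional to p(y | x, a) e^{theta V(y)}.
            tilted = P[x, a] * np.exp(theta * V)
            tilted /= tilted.sum()
            q[x, a] += b_n * (tilted - q[x, a])

            # Slow timescale: relative Q-iteration using the entropy form
            # c + <q, V> - (1/theta) KL(q || p), normalized by Q at the reference pair.
            kl = np.sum(q[x, a] * np.log(q[x, a] / P[x, a]))
            target = C[x, a] + q[x, a] @ V - kl / theta
            Q[x, a] += a_n * (target - Q[ref] - Q[x, a])

print("Q at reference pair (approximates the risk-sensitive ergodic value):", Q[ref])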