Authors: C-Rella, Jorge; Martínez Rego, David; Vilar, Juan M.
Dates: 2025-05-12; 2025-05-12; 2025-02-04
Citation: J. C-Rella, D. Martínez Rego, and J. M. Vilar, "Cost-sensitive reinforcement learning for credit risk," Expert Systems with Applications, vol. 272, p. 126708, May 2025, doi: 10.1016/j.eswa.2025.126708
ISSN: 1873-6793; 0957-4174
URI: http://hdl.handle.net/2183/41967
[Abstract]: Credit risk problems are dynamic because customer behavior is not stable, and they are cost-sensitive because the impact of a decision depends on the amount of the loan. Online learning algorithms, which evolve as more information becomes available, are an appropriate tool for studying these dynamic problems. However, information is available only for approved transactions, which can lead to unfair biases and opportunity costs. Within reinforcement learning, bandit algorithms address this by balancing exploitation (acting according to the current model) and exploration (taking an action about which little information is available in order to improve predictions). The remaining gap is to account for the different classification costs. This paper introduces cost-sensitive reinforcement learning algorithms that solve the credit risk problem from a dynamic perspective, maximizing long-term benefits. It proposes a cost-sensitive passive-aggressive algorithm and a cost-sensitive logistic bandit. Experiments on benchmark datasets and extensive simulation studies demonstrate the effectiveness and efficiency of the proposed algorithms.
Language: eng
License: Attribution 3.0 Spain (http://creativecommons.org/licenses/by/3.0/es/)
Rights: © 2025 The Authors
Keywords: Cost-sensitive classification; Reinforcement learning; Bandit algorithms; Online learning; Credit risk; Decision making
Title: Cost-sensitive reinforcement learning for credit risk
Type: journal article
Access: open access