Both, either, or neither? Taste-based and statistical discrimination in personal loan applications

1. Introduction

Discrimination in loan applications remains a widespread concern worldwide, where decisions are often influenced by attributes such as gender, caste, or income rather than solely by creditworthiness. Prior research documents that even after accounting for financial profiles, certain groups face higher loan rejection rates or less favourable terms (Bartlett et al., 2021).

This issue is particularly salient for personal loans, which rely on limited financial data, thus allowing greater scope for discretion and bias. Furthermore, the typically smaller loan sizes relative to business or mortgage loans increase the likelihood of discretionary lending decisions. Even approved personal loans may entail discrimination through higher interest rates or shorter repayment periods (Ross & Yinger, 2002). Supporting this, Beck et al. (2018) find that borrowers matched with loan officers of the opposite sex often face higher rates.

In India, caste and gender biases significantly affect access to formal credit (Kumar & Venkatachalam, 2019), with women disproportionately impacted due to a lack of collateral, credit history, or documented income (Deshpande & Sharma, 2016). Prior studies have focused on small business or agricultural credit, whereas personal loans remain understudied even though the segment is growing rapidly: it doubled in size from INR 27 trillion (≈$365 billion) in 2020 to INR 53 trillion (≈$635 billion) by 2024, based on average exchange rates during the year (RBI, 2024). These loans encompass a broad spectrum, ranging from shorter-term categories such as consumer durables, unsecured loans, and vehicle loans to longer-duration categories, including housing and collateralised (e.g., gold) loans. Moreover, in 2022, the average ticket size of personal loans (excluding home loans) ranged from INR 0.2 lakh to INR 4.2 lakh across financial entities, indicating wide dispersion (CRIF High Mark, 2022). Given the multiplicity of categories and the significant heterogeneity in borrower profiles, this raises the question of whether banks discriminate in extending such loans.

Using detailed loan-level data for a public sector bank from a central credit bureau, this study addresses two key questions: (1) Is there evidence of gender-based discrimination in personal loan applications? (2) If so, does this discrimination vary with borrower characteristics and across loan categories? Our findings contribute to one of the earliest comprehensive analyses of personal loan discrimination in India.

The baseline results suggest that women with higher credit scores are less likely to face discrimination, with an average marginal effect of a 0.7 percentage point reduction in the probability of loan rejection after controlling for other factors. This effect varies by age group, credit score, and the applicant’s income relative to peers.

In this context, the Indian Constitution and financial regulations provide a strong legal framework to prevent discrimination. For example, Article 15(1) prohibits the State from discriminating on grounds of religion, race, caste, sex, or place of birth (vertical anti-discrimination). Relatedly, Article 15(2) extends this obligation to citizens’ access to public spaces (horizontal anti-discrimination). These protections are complemented by guarantees of equality before the law (Article 14) and equal opportunity in public employment (Article 16).

From a financial perspective, the Reserve Bank of India (RBI)’s Fair Practices Code (FPC) mandates that lenders must not discriminate based on sex, caste, or religion in lending (RBI, 2003). The Code also requires that (1) all communications be conducted in the vernacular or a language understood by the borrower (addressing linguistic discrimination), (2) loan application forms include essential information affecting borrowers’ interests (addressing information discrimination), and (3) uniformity exists in loan terms and conditions (addressing both price- and non-price-based discrimination). The recent RBI Digital Lending guidelines 2025 modernise the FPC's borrower protection, promoting accountability among concerned service providers while aligning with the FPC's emphasis on board-approved codes to foster trust and uniformity in lending (RBI, 2025).

These guidelines align with international standards. For instance, the U.S. Equal Credit Opportunity Act (1974) prohibits discrimination against credit applicants based on race, colour, religion, national origin, sex, marital status, or age. Similarly, the European Union’s Directive 2000/43/EC enshrines equal treatment regardless of race or ethnicity, covering access to goods and services broadly. In the UK, the Guide to Credit Scoring (Para 2.4) explicitly states that credit scoring must not discriminate on grounds of sex, race, religion, disability, or colour.

In the remainder of the paper, we summarise the literature (Section 2), followed by the data and methods (Section 3), the results (Section 4), and policy remarks, including some research limitations, in the final section.

2. Literature Review

From an economic perspective, discrimination manifests in three primary forms. The first is taste-based discrimination, which is subjective and driven by personal preferences or biases (Becker, 1971). The second, statistical discrimination, is more objective and arises when lenders cannot perfectly assess an individual borrower’s creditworthiness (Phelps, 1972; Arrow, 1973) and instead rely on prior beliefs about the average credit risk of the borrower’s group as a proxy for individual risk. While taste-based discrimination deviates from profit maximisation principles, statistical discrimination can coexist with them.

A more recent concept, implicit discrimination, captures unconscious biases that influence decisions without deliberate intent to discriminate. This occurs when lenders unknowingly rely on social stereotypes rather than objective information when evaluating applications (Bertrand et al., 2005).

A broad body of research has explored these discriminatory behaviours. In the United States, Blanchflower (2003) documents discriminatory lending against Black-owned businesses. Other studies report discrimination based on race and gender (Petersen, 1981; Ross & Yinger, 2002; Li, 2018). Beyond the U.S., De Andres et al. (2021) provide evidence from Spain supporting implicit discrimination beyond traditional hypotheses. Similar patterns have been observed across diverse jurisdictions, including Trinidad and Tobago (Storey, 2004), Italy (Bellucci et al., 2010), China (Jiang et al., 2024), and in cross-country analyses (Fafchamps, 2000; Ongena & Popov, 2016). Recently, Garcia et al. (2024) investigated how machine learning algorithms in credit scoring can perpetuate biases, leading to discriminatory outcomes against certain demographic groups despite their creditworthiness. Their study highlights the need for regular audits, transparent model design, and integration of alternative data to mitigate biases while maintaining predictive accuracy.

Similar to taste-based discrimination, numerous studies have also explored statistical discrimination. Han (2004) distinguished between these two forms and found limited evidence supporting taste-based discrimination in U.S. mortgage lending. Other research confirms the presence of statistical discrimination (Munnell et al., 2006; Dymski, 2006; Bartlett et al., 2019), including studies that account for digital factors in credit evaluation (Niu et al., 2019; Wu et al., 2023; Trinh, 2024).

In the Indian context, discriminatory lending practices have been documented across various dimensions. Kumar (2023) provides a comprehensive summary of credit discrimination, highlighting in particular the bias in datasets and fairness metrics, while Banerjee and Knight (1985), Prakash (2010), Jodhka (2010), and Siddique (2011) highlight caste-based discrimination, and Kumar (2013) finds similar patterns in agricultural loans. Fisman et al. (2017) highlight the importance of religion in facilitating loan repayments. Research has also documented discrimination across manifold dimensions, including caste, gender, and religion (Banerjee & Munshi, 2004; Guerin et al., 2013; Iyer et al., 2013; Ghosh, 2023). A recent report highlights discrimination across various spheres, including employment, agricultural credit, and health (Oxfam, 2022). Mishra et al. (2022) focus on credit scoring and algorithmic bias, highlighting the fact that public banks often hesitate to use credit bureau data (“hard information”), preferring soft information for borrower evaluation, a phenomenon termed the ‘relationship dilemma’. However, none of these studies investigates the potential for such discrimination in personal loan applications.

That said, the evidence is not unambiguous. Several studies report that female-owned businesses may secure larger loans on more favourable terms (Hansen & Rand, 2014; Pham & Talavera, 2018) or even experience no discrimination at all (Corsi & de Angelis, 2017).

Our contribution to the literature is threefold. First, we investigate the presence and nature of discrimination in consumer credit, distinguishing between taste-based and statistical discrimination. This aligns with a growing body of research documenting discriminatory lending practices (Du and Zeng, 2019; Atkins et al., 2022). However, unlike most existing studies that focus on business or mortgage lending, our analysis centres on consumer credit, a rapidly expanding segment within the Indian banking sector that has received limited scholarly attention.

Second, our work contributes to the broader literature examining the role of gender in economic outcomes. Prior research has underscored gender-based differences in access to finance (Demirgüç-Kunt et al., 2022; Qi et al., 2022; Ghosh, 2025), credit allocation (Chaudhuri et al., 2020), and macroeconomic variables such as inflation expectations and responses (D’Acunto et al., 2021). Unlike these studies, we focus specifically on gender dynamics in consumer lending while carefully controlling for confounding borrower characteristics.

Third, a key novelty of our study lies in the use of granular, loan-level data to analyse borrower behaviour, an area that has gained traction in recent empirical research (Van den Berg et al., 2015; Beck et al., 2018; Blanco-Oliver et al., 2021; Bertrand & Burietz, 2023). For instance, Van den Berg et al. (2015) find that female loan officers in Mexico are less likely to ensure loan repayment among microfinance clients. Beck et al. (2018) report substantial differences in loan terms and borrower experiences when the loan officer and applicant are of opposite genders. Blanco-Oliver et al. (2021), using international data, show that female loan monitoring may weaken microfinance portfolio quality. Bertrand & Burietz (2023) demonstrate that exclusive reliance on hard information results in higher spreads charged to otherwise creditworthy borrowers than approaches that also consider soft information. Our research contributes to this literature by utilising detailed personal loan-level data within a single-country context to offer fresh insights into the nature and extent of discrimination in consumer credit.

3. Data and Methods

We utilise proprietary, cross-sectional data from a leading credit bureau in India for 2023. The data includes loan-level information on 51,336 customers of a large public-sector bank, encompassing two sets of variables: one at the bank level and the other at the bureau level. The former includes details such as the percentage of active accounts and the count of various loan categories (e.g., consumer loans, gold loans, home loans). The latter includes a much richer set of variables, including whether a loan has been approved, as well as individual-level controls for gender, age, marital status, monthly income, and time with the current employer.

To contextualise the process, credit bureaus aggregate data on individual credit transactions from banks and other financial institutions. These data encompass both new and outstanding loan applications, including unsecured loans and their repayment histories. In some cases, information on mobile and utility bill payments is also incorporated. This transactional data is combined with borrower-specific characteristics, such as gender, employment status, and income, to generate a credit information report and assign a credit score, ranging from 300 to 900. Higher scores denote greater creditworthiness. Generally, scores of 750 and above are classified as ‘excellent’, those from 650 to 749 as ‘good’, and those from 600 to 649 as ‘fair’. Scores below 600 indicate elevated credit risk. Disaggregating by gender, 184 observations fall in the ‘excellent’ category (no female applicants), 48,448 in the ‘good’ category (12% female applicants), and 2,577 in the ‘fair’ category (11.4% female applicants); the remainder fall in the elevated-risk category, which also contains no female applicants.
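The banding just described can be written down as a small lookup. The sketch below is illustrative only: the function name `score_band` is our own, and the elevated-risk count is simply the residual implied by the reported band counts and the sample size of 51,336.

```python
def score_band(score: int) -> str:
    """Map a bureau credit score (300-900) to the bands described in the text."""
    if not 300 <= score <= 900:
        raise ValueError("credit score must lie between 300 and 900")
    if score >= 750:
        return "excellent"
    if score >= 650:
        return "good"
    if score >= 600:
        return "fair"
    return "elevated risk"


# Band counts as reported in the text; the remainder is implied, not reported.
reported = {"excellent": 184, "good": 48_448, "fair": 2_577}
total_customers = 51_336
elevated_risk = total_customers - sum(reported.values())

print(score_band(680), elevated_risk)
```

Because the ‘good’ band holds the overwhelming majority of applicants, most of the identifying variation in the later subsample analysis comes from that band.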

These scores serve dual purposes. From the demand side, they condense a borrower’s financial history into a single metric, incentivising prudent financial behaviour and enabling risk-based loan pricing. From the supply side, they facilitate informed lending decisions, enhance credit screening, and improve loan performance by reducing default rates.

Table 1 summarises the variables employed and their summary statistics. At the bank level, we find that 58% of accounts are, on average, active, with the average age of the newest account opened being approximately 16 months. Among loan accounts, other loans dominate, with nearly 77% of borrowers reporting a positive balance; housing loans are the lowest, with an average of 5% of borrowers reporting a positive balance.

| Level | Variable (notation) | Obs. | Mean (SD) |
|---|---|---|---|
| Bank-level | Per cent of active accounts (Pct_ACTV) | 51336 | 0.577 (0.380) |
| Bank-level | Dummy=1, if a borrower reports a positive count of housing loan accounts (HSNG_ACC) | 51336 | 0.052 (0.224) |
| Bank-level | Dummy=1, if a borrower reports a positive count of other loan accounts (OTH_ACC). Other loans include automobile, consumer goods, and other personal loans | 51336 | 0.768 (0.422) |
| Bank-level | Dummy=1, if a borrower reports a positive count of unsecured loan accounts (UNSEC_ACC) | 51336 | 0.650 (0.477) |
| Bank-level | Dummy=1, if a borrower reports a positive count of gold loan accounts (GOLD_ACC) | 51336 | 0.271 (0.445) |
| Bureau-level | Credit score (CSCORE): Natural logarithm of credit score | 51336 | 6.521 (0.030) |
| Bureau-level | Loan rejection (REJECT): Dummy=1 if a loan was rejected | 51296 | 0.347 (0.476) |
| Bureau-level | Loan rejection (REJECT 3M): Dummy=1 if a loan was rejected in the past 3 months | 51296 | 0.223 (0.416) |
| Bureau-level | Loan rejection (REJECT 6M): Dummy=1 if a loan was rejected in the past 6 months | 51296 | 0.424 (0.494) |
| Bureau-level | Loan rejection (REJECT 12M): Dummy=1 if a loan was rejected in the past 12 months | 51296 | 0.653 (0.476) |
| Bureau-level | AGE: Natural logarithm of age | 51336 | 3.486 (0.252) |
| Bureau-level | Gender (FEM, %): Dummy=1 if the respondent is female, else zero | 51336 | 11.865 (32.34) |
| Bureau-level | Marital status (MARRIED): Dummy=1 if the respondent is married, else zero | 51336 | 0.735 (0.441) |
| Bureau-level | Education (EDU): Categorical variable. Equals 1 if others, 2 if secondary, 3 if higher secondary, 4 if undergraduate, 5 if graduate, 6 if post-graduate, and 7 if professional | 51336 | 3.614 (1.383) |
| Bureau-level | Relative income (REL_INC): Income of an applicant / Average income | 51336 | 1.001 (0.758) |
| Bureau-level | Time (in months) with current employer: Natural logarithm of number of months (Ln EMPL) | 51334 | 4.489 (0.678) |

Table 1. Variable definition and summary statistics

At the bureau level, the average credit score is 680, which corresponds to a natural logarithm of about 6.5. On average, nearly 35% of loan applications were rejected; among the windowed measures, the rejection rate is highest over the past 12 months. The average age of a borrower is 34 years, female borrowers comprise 12% of the total, and relative (net) monthly income averages 1 by construction, since each applicant's income is divided by the average income.
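As a quick arithmetic check on how the logged entries in Table 1 relate to the raw figures quoted in the text, the snippet below back-transforms the reported means. One caveat applies: exponentiating a mean of logarithms gives a geometric mean, which can never exceed the arithmetic mean, so the back-transformed age (a little under 33 years) sitting below the reported average of roughly 34 years is expected and is not a discrepancy.

```python
import math

mean_ln_score = 6.521   # Table 1: mean of ln(credit score)
mean_ln_age = 3.486     # Table 1: mean of ln(age)

ln_of_680 = math.log(680)                  # log of the reported average score
geo_mean_score = math.exp(mean_ln_score)   # geometric-mean credit score
geo_mean_age = math.exp(mean_ln_age)       # geometric-mean age in years

print(f"ln(680)              = {ln_of_680:.3f}")
print(f"exp(mean ln score)   = {geo_mean_score:.1f}")
print(f"exp(mean ln age)     = {geo_mean_age:.1f}")
```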

Figure 1 presents the averages of the key variables, overall and by age bucket. Reflecting the life-cycle hypothesis, we consider three age buckets: low (21-35 years, abbreviated Y), middle (36-50 Y), and high (above 50 Y). The subsequent analysis also takes this into account. The brackets under each column provide the average age, the total number of females, and the total number of applicants in that category. Thus, the average applicant age for the overall sample is 33.8 years, and there are 6,088 females, with the majority (3,853, or 63%) falling within the middle-aged category. The average credit score is 680, and it rises across successive age buckets. The right-hand scale (rhs) plots the average income, which is likewise highest in the above-50-year bucket and lowest in the 21-35-year bucket.

Figure 1. Average values of key variables. Note. The figure shows the average values of the key variables employed in the analysis. The variable definitions and notations have been provided in Table 1.

3.1. Dependent variable

The primary outcome variable in our analysis is Reject, which equals 1 if a loan application is rejected and 0 otherwise. To ensure robustness, we also employ alternative definitions of the outcome variable that differ in temporal stringency. Specifically, we treat rejections within the past three months (3M) as the most stringent definition, those within the past six months (6M) as moderately stringent, and those within the past twelve months (12M) as the least stringent. From an economic perspective, a recent rejection is more likely to reflect ongoing or unresolved concerns about creditworthiness. In contrast, a rejection further in the past may pertain to circumstances that have since changed or become irrelevant.
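To make the nesting of the three windows concrete, the following sketch builds the windowed dummies from a recency measure. The input field `months_since_rejection` is hypothetical (the bureau data may record rejections differently), and the function name is ours; the point is only that a rejection counted under the 3M definition is also counted under 6M and 12M, which is consistent with the rising means in Table 1.

```python
from typing import Optional

WINDOWS = (3, 6, 12)  # months, ordered from most to least stringent


def rejection_flags(months_since_rejection: Optional[float]) -> dict:
    """Return REJECT_3M/6M/12M dummies from a hypothetical recency field.

    `months_since_rejection` is None when no rejection is on record.
    """
    flags = {}
    for window in WINDOWS:
        hit = (months_since_rejection is not None
               and months_since_rejection <= window)
        flags[f"REJECT_{window}M"] = int(hit)
    return flags


print(rejection_flags(5))     # rejected five months ago
print(rejection_flags(None))  # no recorded rejection
```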

3.2. Independent variables

The first principal independent variable is the applicant’s credit score (measured in natural logarithms). The use of natural logarithms is premised on two considerations. First, as our previous discussion indicates, the distribution of raw scores is skewed, with several borrowers having excellent scores and, likewise, a tail of borrowers having high credit risk. Taking the natural logarithm reduces this skewness and brings the distribution closer to normal. Second, logarithms allow coefficients to be interpreted in terms of proportional changes in the score, regardless of its raw level.

The other two key independent variables are relative income and a binary indicator denoting female gender. We utilise relative income, rather than absolute income, for three reasons. First, the reported income figure reflects net (post-deduction) earnings, thereby accounting for any existing financial obligations or encumbrances the applicant may have. Second, income is a fundamental determinant of credit history; all else being equal, individuals with higher income are more likely to maintain better credit scores due to their greater capacity to meet repayment obligations. Third, from a lender’s perspective, an applicant’s income is likely assessed relative to the income distribution of a comparable peer group, because lenders might face underreporting of income (e.g., Cowling et al., 2020; Montaya et al., 2020). A higher relative income ratio may thus serve as a more informative indicator of creditworthiness than absolute income.
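A minimal sketch of the two transformations follows, using three made-up applicants. The record layout and the function name `add_transforms` are assumptions for illustration; relative income is computed as in Table 1, that is, the applicant's income divided by the average income, so its mean equals one by construction.

```python
import math
from statistics import mean


def add_transforms(applicants):
    """Attach ln(score) as CSCORE and income / average income as REL_INC."""
    avg_income = mean(a["income"] for a in applicants)
    return [
        {**a,
         "CSCORE": math.log(a["score"]),
         "REL_INC": a["income"] / avg_income}
        for a in applicants
    ]


# Hypothetical records, not drawn from the bureau data.
sample = [
    {"score": 640, "income": 30_000},
    {"score": 700, "income": 45_000},
    {"score": 720, "income": 75_000},
]
transformed = add_transforms(sample)
for row in transformed:
    print(f"CSCORE={row['CSCORE']:.3f}  REL_INC={row['REL_INC']:.2f}")
```

If lenders benchmark income against a narrower peer group (for example, by occupation or region), the denominator would be a group-specific average rather than the sample average used here.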

3.3. Other controls

We control for other applicant characteristics, such as age, marital status, education, and the duration of employment with the current employer (in months). These variables are akin to those considered in prior research and include whether an applicant is an entrant to the job market, as well as human capital and work experience (Black et al., 1978; Jiang et al., 2024).

The correlation matrix for the key variables is presented in Table 2. Correlations with the outcome variable are low, with a maximum of about 9% (for age). Correlations among the explanatory variables are also modest, the largest being 0.27 between credit score and age, so multicollinearity is less of a concern.

| Variable | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 Reject | | | | |
| 2 LN(CSCORE) | 0.071*** | | | |
| 3 FEM | -0.009** | -0.0007 | | |
| 4 REL_INC | -0.035*** | 0.017*** | -0.043*** | |
| 5 Ln Age | 0.091*** | 0.269*** | 0.099*** | 0.085*** |

Table 2. Correlation matrix of key variables. Note. The table shows the bivariate correlation matrix for the key variables employed in the analysis. The variable definitions and notations have been provided in Table 1. ***p<0.01; **p<0.05
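With a sample of roughly 51,000 applicants, even very small correlations can be statistically significant, which is why low magnitudes and significance stars coexist in Table 2. The sketch below applies the standard t-statistic for a correlation coefficient, t = r·sqrt(n − 2)/sqrt(1 − r²), to three entries of the table; using the same n for every pair is a simplifying assumption.

```python
import math


def corr_t_stat(r: float, n: int) -> float:
    """t-statistic for H0: correlation = 0, given sample correlation r and n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


N = 51_296  # observations with a non-missing rejection indicator (Table 1)

pairs = [
    ("FEM vs Reject", -0.009),
    ("REL_INC vs Reject", -0.035),
    ("FEM vs LN(CSCORE)", -0.0007),
]
for label, r in pairs:
    print(f"{label:20s} r = {r:+.4f}   t = {corr_t_stat(r, N):+.2f}")
```

The first pair clears the 5% threshold only narrowly and the third falls well short of it, which matches the pattern of stars reported in the table.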

3.4. Empirical strategy

To assess the impact of gender and income on loan rejection among applicants (denoted by h) with varying credit scores, we estimate a logit specification, as in Eq. (1).

(1) $$\text{Logit}(Reject_h = 1) = \alpha + \beta\, CSCORE_h + \gamma\, FEM_h + \mu\, (CSCORE_h \ast FEM_h) + \delta\, REL\_INC_h + \lambda X_h + \varepsilon_h$$

The variable notation is provided in Table 1; X is a vector of applicant-specific controls, including age, education, employment history, marital status, and the share of active accounts, and ε is a random error term. The two key coefficients are µ and δ. The former captures how the relationship between credit score and the likelihood of rejection differs for female applicants. A statistically significant coefficient would support (gender-based) taste discrimination. The latter shows the impact of the applicant’s income (relative to peers) on the likelihood of loan rejection. A statistically significant coefficient would provide evidence in favour of (income-based) statistical discrimination. We report the Average Marginal Effect (AME) for the key coefficients: the change in the predicted probability of rejection associated with a marginal change in a variable, averaged over all applicants in the sample. Unlike the raw logit coefficient, the AME is expressed in probability units and can therefore be read directly as an effect on the likelihood of loan rejection.
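To illustrate how an AME is obtained from a logit index such as Eq. (1), the sketch below computes it analytically (the coefficient times p(1 − p), averaged over applicants) and checks the result against a finite-difference approximation. Several things here are assumptions: the slopes are borrowed from column 1 of Table 3, the intercept is hypothetical because it is not reported, the controls X are omitted, and the four applicants are invented. The output is therefore not a replication of the reported AMEs, and statistical packages may treat the interaction term differently.

```python
import math

# Slopes from Table 3, column 1; the intercept is hypothetical (not reported).
COEF = {"const": 5.7, "CSCORE": -0.941, "FEM": 0.206,
        "CSCORE_FEM": -0.032, "REL_INC": -0.242}

# Invented applicants: ln(score), female dummy, relative income.
ROWS = [
    {"CSCORE": math.log(640), "FEM": 0, "REL_INC": 0.6},
    {"CSCORE": math.log(700), "FEM": 1, "REL_INC": 0.9},
    {"CSCORE": math.log(720), "FEM": 0, "REL_INC": 1.5},
    {"CSCORE": math.log(660), "FEM": 1, "REL_INC": 1.0},
]


def prob(row, coef=COEF):
    """Predicted rejection probability from the logit index (controls omitted)."""
    eta = (coef["const"]
           + coef["CSCORE"] * row["CSCORE"]
           + coef["FEM"] * row["FEM"]
           + coef["CSCORE_FEM"] * row["CSCORE"] * row["FEM"]
           + coef["REL_INC"] * row["REL_INC"])
    return 1.0 / (1.0 + math.exp(-eta))


def ame_analytic(rows, var, coef=COEF):
    """Average over applicants of d p / d var for a continuous regressor."""
    total = 0.0
    for r in rows:
        p = prob(r, coef)
        if var == "REL_INC":
            slope = coef["REL_INC"]
        elif var == "CSCORE":
            # The credit-score slope differs by gender through the interaction.
            slope = coef["CSCORE"] + coef["CSCORE_FEM"] * r["FEM"]
        else:
            raise ValueError("only CSCORE and REL_INC are handled in this sketch")
        total += slope * p * (1.0 - p)
    return total / len(rows)


def ame_numeric(rows, var, coef=COEF, h=1e-6):
    """Central finite-difference AME, used only as a check on the formula."""
    total = 0.0
    for r in rows:
        up = {**r, var: r[var] + h}
        down = {**r, var: r[var] - h}
        total += (prob(up, coef) - prob(down, coef)) / (2 * h)
    return total / len(rows)


for name in ("REL_INC", "CSCORE"):
    print(name, round(ame_analytic(ROWS, name), 4), round(ame_numeric(ROWS, name), 4))
```

Because p(1 − p) can never exceed 0.25, an AME is always smaller in magnitude than one quarter of the corresponding logit coefficient, which is why the AMEs in the tables are much smaller than the coefficients above them.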

4. Results and discussion

4.1. Main findings

Table 3 presents the estimation results. In Column 1, the coefficient on the interaction term (μ) is negative and statistically significant. The corresponding Average Marginal Effect (AME) indicates that the probability of rejection is 0.7 percentage points lower for female applicants with higher credit scores. This finding offers no support for taste-based discrimination; instead, it points to favourable treatment of women, consistent with recent evidence from credit markets (Jiang et al., 2024; Card et al., 2022). In a global sample, Hermes and Lensink (2011) show that a higher proportion of female clients in microfinance institutions was associated with lower portfolio risk. Prior evidence from India also shows that self-help groups run by women exhibit higher reliability, as evidenced by lower default rates, thereby increasing their likelihood of credit uptake (Srivastava, 2013).

Variable Overall 600-649 (fair) 650-749 (good)
(1) (2) (3) (4)
CSCORE -0.941*** (0.338) -0.901*** (0.305) 0.876 (0.548) -1.459*** (0.550)
AME -0.194 -0.188 0.211 -0.027
FEM 0.206*** (0.061) 0.275*** (0.102) -0.191* (0.108) 0.623 (0.631)
AME 0.043 0.067 -0.029 0.016
CSCORE*FEM -0.032*** (0.009) -0.039** (0.017) -0.043**(0.019)
AME -0.007 -0.018 -0.006
REL_INC -0.242*** (0.026) -0.193*** (0.074) -0.177*** (0.036) -0.258*** (0.033)
AME -0.050 -0.039 -0.041 -0.067
FEM*REL_INC -0.273 (0.189)
AME -0.028
Pct_ACTV -1.584*** (0.028) -1.107*** (0.036) -1.202*** (0.068) -0.445*** (0.165)
AGE 0.296*** (0.045) -0.067 (0.339) -0.253 (0.983) 0.374 (0.237)
EDU -0.023*** (0.007) -0.054*** (0.006) -0.384*** (0.162) -0.152** (0.067)
MARRIED -0.009 (0.022) 0.047 (0.035) -0.054 (0.049) 0.106 (0.126)
Ln EMPL -0.133*** (0.016) -0.263*** (0.114) -0.169*** (0.038) -0.258** (0.124)
Obs. 51,334 51,334 48,446 2,577
Log pseudo-likelihood -30,856 29,987 -19,455 -11,845
McFadden R-sq. 0.068 0.052 0.025 0.077
Table 3. Main findings. Note: The table shows the results from estimating equation 1. The dependent variable is 1 if a loan is rejected and 0 otherwise. The variable definitions and notations are in Table 1. Standard errors (clustered by customer) in parentheses. ***p<0.01; **p<0.05; *p<0.10
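For readers unfamiliar with the fit statistic in the last row, McFadden's pseudo R-squared is one minus the ratio of the model log-likelihood to the log-likelihood of an intercept-only model. The sketch below approximates the intercept-only log-likelihood from the sample size and the average rejection rate (both rounded, as reported) and recovers a value close to the 0.068 shown in column 1. The function names are ours, and the small gap to the reported figure is expected given rounding.

```python
import math


def null_loglik(n: int, reject_rate: float) -> float:
    """Log-likelihood of an intercept-only logit with n observations."""
    p = reject_rate
    return n * (p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def mcfadden_r2(loglik_model: float, loglik_null: float) -> float:
    """McFadden pseudo R-squared: 1 - LL_model / LL_null."""
    return 1.0 - loglik_model / loglik_null


ll_null = null_loglik(51_334, 0.347)   # sample size (Table 3) and mean of REJECT (Table 1)
r2 = mcfadden_r2(-30_856, ll_null)     # column 1 log pseudo-likelihood

print(f"approximate null log-likelihood: {ll_null:,.0f}")
print(f"implied McFadden R-squared:      {r2:.3f}")
```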

The coefficient on relative income (δ) is also negative and statistically significant, with an AME of -0.05. This implies that a one-unit increase in income relative to the peer group is associated with a 5 percentage point lower probability of rejection. The interpretation is intuitive: households with greater disposable income after meeting other obligations are perceived as lower-risk borrowers, thereby reducing the likelihood of rejection (Leika & Marchettini, 2017). This supports the presence of statistical discrimination in the lending process.
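To put the size of this effect in context, the arithmetic below scales the AME by one standard deviation of relative income and compares it with the average rejection rate, using only figures reported in Tables 1 and 3. Since an AME is a local (linear) approximation, the numbers are indicative rather than exact.

```python
baseline_reject = 0.347   # mean of REJECT, Table 1
ame_rel_inc = -0.050      # AME of REL_INC, Table 3, column 1
sd_rel_inc = 0.758        # standard deviation of REL_INC, Table 1

# Change in rejection probability for a one-standard-deviation rise in REL_INC.
pp_change = ame_rel_inc * sd_rel_inc
# The same change expressed relative to the average rejection rate.
relative_change = pp_change / baseline_reject

print(f"Change in rejection probability: {100 * pp_change:.1f} percentage points")
print(f"Relative to the baseline rate:   {100 * relative_change:.1f} per cent")
```

The distinction matters when comparing across subsamples: the same percentage-point effect is a larger proportional change where baseline rejection rates are lower.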

Next, we augment the baseline model with the interaction of gender and income. If high-income women are more likely to be rejected for loans, the interaction term would be positive and statistically significant, supporting statistical discrimination against female borrowers. In column 2, the coefficient is not statistically significant, suggesting limited evidence of statistical discrimination against females.
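For concreteness, the augmented specification estimated in column 2 can be written as an extension of Eq. (1). The symbol θ for the new interaction coefficient is our notation and does not appear elsewhere in the paper:

$$\text{Logit}(Reject_h = 1) = \alpha + \beta\, CSCORE_h + \gamma\, FEM_h + \mu\, (CSCORE_h \ast FEM_h) + \delta\, REL\_INC_h + \theta\, (FEM_h \ast REL\_INC_h) + \lambda X_h + \varepsilon_h$$

Under this specification, the effect of relative income on the logit index is δ for male applicants and δ + θ for female applicants, so θ measures how the income gradient in rejection differs by gender.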

We then estimate the model separately for subsamples defined by conventional credit score cutoffs. Based on our previous discussion, we employ the two buckets that contain female borrowers, labelled earlier as the ‘good’ and ‘fair’ categories. In both instances (columns 3-4), we find evidence of favourable treatment of women as loan applicants, consistent with our main findings. Statistical discrimination also appears to be an important consideration for both categories, although its relevance is more pronounced for the ‘fair’ category. Intuitively, a ‘fair’ credit score combined with a higher income sends an inconsistent signal that can be a ‘red flag’, often reflecting a history of missed payments and poor financial habits. From a lender's perspective, this mismatch reflects moral hazard: a lack of financial discipline that overwhelms the higher income, compelling the lender to exercise prudent risk management practices.

Other control variables yield expected results. Longer employment tenure and higher educational attainment enhance the chances of loan approval. Similarly, a higher share of active credit accounts serves as a proxy for a robust credit history, further improving application outcomes.

To sum up, the evidence supports statistical discrimination, though in the case of taste-based discrimination, it runs contrary to established convention.

4.2. Additional checks

In Table 4, we estimate the baseline specification separately by age bucket (indicated at the top of each column) to examine heterogeneity in the response. For brevity, we highlight only the key Average Marginal Effects (AMEs).

Variable Age: 21-35 Y(low) Age: 36-50 Y(middle) Age: Above 50 Y(high)
(2) (3) (4)
CSCORE 7.211*** (0.612) -1.842***(0.448) -5.392*** (1.281)
AME 1.392 -0.394 -1.229
FEM 0.342*** (0.129) 0.145* (0.079) 0.100 (0.218)
AME 0.066 0.031
CSCORE*FEM -0.052***(0.019) -0.022* (0.012) -0.015(0.033)
AME -0.010 -0.005
REL_INC -0.272*** (0.041) -0.222*** (0.033) -0.186 (0.125)
AME -0.053 -0.048
Pct_ACTV -1.575*** (0.042) -1.647*** (0.040) -1.474*** (0.120)
AGE -0.009(0.153) 0.582*** (0.100) 1.712*** (0.695)
EDU -0.030*** (0.012) -0.013 (0.009) -0.002 (0.030)
MARRIED -0.028(0.034) 0.016 (0.030) -0.113 (0.093)
Ln EMPL 0.145***(0.027) 0.128*** (0.022) 0.111* (0.063)
Obs. 22,064 26,570 2,700
Log pseudo-likelihood -12,594 -16,408 -1,748
McFadden R-sq. 0.074 0.065 0.052
Table 4. Robustness – Disaggregation by age bucket. Note: The table shows the results from estimating equation 1 separately by age bucket, as indicated at the top of each column. The dependent variable is 1 if a loan is rejected and 0 otherwise. The variable definitions and notations are in Table 1. Standard errors (clustered by customer) in parentheses. ***p<0.01; **p<0.05; *p<0.10

Two principal findings emerge. First, evidence of favourable treatment—whereby women with higher credit scores face lower rejection probabilities—is apparent in the younger (21–35 years) and middle-aged (36–50 years) groups, but not among older applicants (above 50 years). A similar pattern holds for relative income, suggesting that the mitigating effect of income on loan rejection diminishes with age. Second, in both dimensions, the impact is most substantial among the 21–35-year-old cohort: young women with higher credit scores and applicants with above-average relative incomes in this age group are the least likely to be rejected for a loan.

In Table 5, we further disaggregate the key dependent and independent variables, segregating the analysis not only by age group but also, in the process, using progressively less stringent definitions of the outcome variable (as indicated in the column headers). For parsimony, control variables are not reported. Overall, the Average Marginal Effects (AMEs) align with the baseline findings, though their magnitude varies across age cohorts.

Dep var REJECT 3M (most stringent) REJECT 6M (moderately stringent) REJECT 12M (least stringent)
Age bucket (years, Y) 21-35 36-50 Above 50 21-35 36-50 Above 50 21-35 36-50 Above 50
(1) (2) (3) (4) (5) (6) (7) (8) (9)
CSCORE 0.150*** (0.007) 0.048*** (0.006) 0.104 (1.976) 0.115*** (0.006) 1.254*** (0.474) -0.039*** (0.013) 0.071*** (0.006) -1.850*** (0.449) -0.034*** (0.013)
AME 0.027 0.007 0.026 0.287 -0.744 0.014 -0.396 -0.744
FEM 0.107 (0.164) 0.195 (0.120) -0.215 (0.355) 0.182 (0.135) 0.092 (0.088) 0.129 (0.267) 0.346*** (0.129) 0.148* (0.079) 0.129 (0.267)
AME 0.032 0.067 0.032
CSCORE*FEM -0.016 (0.025) -0.029* (0.018) 0.033 (0.054) -0.028 (0.021) -0.014 (0.013) -0.019 (0.041) -0.053*** (0.019) -0.023* (0.012) -0.019 (0.041)
AME -0.005 -0.010 -0.050
REL_INC -0.076 (0.062) -0.205*** (0.031) -0.178* (0.105) -0.089 (0.059) -0.218*** (0.028) -0.163 (0.103) -0.273*** (0.041) -0.221*** (0.033) -0.163 (0.103)
AME -0.033 -0.026 -0.050 -0.053 -0.047
Controls Y Y Y Y Y Y Y Y Y
Obs. 22,044 26,551 2,699 22,044 26,551 2,699 22,044 26,551 2,699
Log pseudo-likelihood -11,765 -13,316 -1,242 -14,526 -17,316 -1,702 -12,563 -16,384 -1,702
McFadden R-sq. 0.041 0.022 0.008 0.044 0.033 0.023 0.075 0.065 0.023
Table 5. Robustness – Disaggregation by rejection period and age bucket. Note: The table shows the results from estimating equation 1, separately by loan stringency (dependent variable) and age bucket (key independent variable). These variables are indicated at the top of the column. The variable definitions and notations are in Table 1. Standard errors (clustered by customer) in parentheses ***p<0.01;** p<0.05; *p<0.10
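The nine columns of Table 5 arise from crossing three outcome definitions with three age buckets. The sketch below shows one way to organise such a grid; the record layout and the helper names are hypothetical, and the logit estimation that would be run within each cell is deliberately left out. Note that the same applicants enter all three outcome windows within an age bucket, which is why the observation counts repeat across windows in the table.

```python
from itertools import product

WINDOWS = ("REJECT_3M", "REJECT_6M", "REJECT_12M")
BUCKETS = ("21-35", "36-50", "Above 50")


def age_bucket(age: float) -> str:
    """Assign an applicant to one of the three age buckets used in the paper."""
    if age < 21:
        raise ValueError("ages below 21 fall outside the buckets described")
    if age <= 35:
        return "21-35"
    if age <= 50:
        return "36-50"
    return "Above 50"


def build_cells(records):
    """Group records into the nine (outcome window, age bucket) cells."""
    cells = {cell: [] for cell in product(WINDOWS, BUCKETS)}
    for rec in records:
        bucket = age_bucket(rec["age"])
        for window in WINDOWS:
            # Store the outcome for this window alongside the full record.
            cells[(window, bucket)].append((rec[window], rec))
    return cells


# Invented records for illustration only.
demo = [
    {"age": 30, "REJECT_3M": 0, "REJECT_6M": 1, "REJECT_12M": 1},
    {"age": 42, "REJECT_3M": 0, "REJECT_6M": 0, "REJECT_12M": 1},
    {"age": 57, "REJECT_3M": 1, "REJECT_6M": 1, "REJECT_12M": 1},
]
cells = build_cells(demo)
for key, rows in cells.items():
    print(key, len(rows))
```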

Notably, the loan rejection probability for women with high credit scores is lowest in the 36–50 years age group. However, under the “moderately stringent” rejection definition, this effect becomes statistically indistinguishable from zero, indicating that the favourable treatment is primarily concentrated in the “most stringent” and “least stringent” categories.

Regarding statistical discrimination, the findings remain consistent with the baseline: a higher relative income significantly reduces the likelihood of loan rejection. In the “most stringent” category, the AMEs are statistically significant in the older age brackets, indicating that a one-unit increase in relative income lowers the probability of rejection by 3.3 percentage points for applicants aged 36–50 years and by 2.6 percentage points for those above 50 years. In the “least stringent” category, where the results are statistically significant, the magnitudes are generally larger, underscoring the persistent importance of income.

Finally, we turn to heterogeneity across loan types. In this respect, we consider four categories: home loans, gold loans, unsecured loans, and other personal loans. Collectively, these categories span a broad spectrum of maturity structures (e.g., home loans vs. personal loans) and collateral requirements (e.g., gold loans vs. unsecured loans). The estimation findings are summarised in Table 6.

| Loan category | HSNG_ACC (1) | OTH_ACC (2) | UNSEC_ACC (3) | GOLD_ACC (4) |
|---|---|---|---|---|
| CSCORE | -0.644 (1.256) | 0.305*** (0.039) | 0.309*** (0.044) | -0.817*** (0.065) |
| AME | | 0.585 | 0.052 | -0.131 |
| FEM | -0.016 (0.209) | 0.088 (0.079) | 0.169** (0.077) | 0.402*** (0.134) |
| AME | | | 0.029 | 0.065 |
| CSCORE*FEM | 0.003 (0.032) | -0.014 (0.012) | -0.026** (0.011) | -0.062*** (0.020) |
| AME | | | -0.004 | -0.009 |
| REL_INC | -0.131** (0.061) | -0.287*** (0.029) | -0.184*** (0.027) | -0.216*** (0.065) |
| AME | -0.028 | -0.055 | -0.031 | -0.035 |
| Controls | Y | Y | Y | Y |
| Obs. | 2,715 | 39,424 | 33,356 | 13,917 |
| Log pseudo-likelihood | -1,687 | -22,329 | -17,202 | -6,762 |
| McFadden R-sq. | 0.023 | 0.070 | 0.052 | 0.195 |
Table 6. Robustness – Disaggregation by loan types. Note: The table shows the results from estimating equation (1) for various loan categories, as indicated at the top of each column. The dependent variable is 1 if a loan is rejected and 0 otherwise. The variable definitions and notations are in Table 1. Standard errors (clustered by customer) in parentheses. *** p<0.01; ** p<0.05; * p<0.10

Regarding taste-based discrimination, the direction of the coefficients remains consistent with the baseline findings for unsecured (column 3) and gold (column 4) loans. Specifically, the AME for gold loans indicates that female applicants with high credit scores are about 1% less likely to be rejected. A key reason for the persistence of taste-based discrimination in gold loans is the very asset that backs them. Because the collateral (gold jewellery) can be valued objectively, lenders rely less on additional "objective" metrics such as credit scores or income statements, which are the basis for statistical discrimination. This leaves more room for personal biases, prejudices, and ‘taste’ to influence the lending decision. In addition, gold is considered a safe-haven asset, having appreciated by over 70% between 2013 and 2024. The security of the collateral and its rising value reduce assessment and valuation risks for banks. This is consistent with recent data showing a fivefold increase in gold loans since 2019, predominantly driven by women borrowers, who account for nearly 40% of such loans (TransUnion CIBIL–NITI Aayog, 2025).
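One way to read the column 3 and column 4 estimates together is to ask at what credit score the female coefficient and the interaction term offset each other. The sketch below does this arithmetic in Python using the Table 6 point estimates. It assumes that equation (1), which is not reproduced in this section, enters FEM and CSCORE*FEM linearly in the latent rejection index; it ignores sampling uncertainty, and the threshold is expressed in whatever units Table 1 defines for CSCORE. The function names are illustrative.

```python
# Point estimates copied from Table 6 (columns 3 and 4).
table6 = {
    "UNSEC_ACC": {"FEM": 0.169, "CSCORE*FEM": -0.026},
    "GOLD_ACC": {"FEM": 0.402, "CSCORE*FEM": -0.062},
}


def female_index_shift(coefs, cscore):
    """Shift in the latent rejection index for a female applicant at `cscore`.

    Assumes the index is linear in FEM and CSCORE*FEM (an assumption about
    equation (1), which this section does not reproduce).
    """
    return coefs["FEM"] + coefs["CSCORE*FEM"] * cscore


def break_even_cscore(coefs):
    """CSCORE at which the female shift is zero, i.e. where its sign changes."""
    return -coefs["FEM"] / coefs["CSCORE*FEM"]


for loan_type, coefs in table6.items():
    print(loan_type, round(break_even_cscore(coefs), 2))
```

Under these assumptions the implied threshold is roughly 6.5 for both loan types: the female shift in the rejection index is positive below it and negative above it.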

Consistent with earlier findings, higher income significantly reduces the likelihood of loan denial. The AMEs indicate that higher relative income lowers the probability of rejection in every category, with the strongest effect for "other" personal loans (column 2) and the weakest for housing loans (column 1). The comparatively higher rejection rate for housing loans may reflect factors beyond credit scores, such as employment type, incomplete documentation, or external issues including unapproved builders, flawed valuations, or properties located in restricted zones.
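For readers less familiar with AMEs, the following sketch shows how one is obtained from an index coefficient. It assumes a logit link (this section reports a log pseudo-likelihood and a McFadden R-squared but does not state the link; a probit would use the normal density instead). The coefficients and applicants are synthetic and hypothetical, not the paper's data.

```python
import math
import random


def logistic(z):
    """Logistic CDF, the link function assumed for this sketch."""
    return 1.0 / (1.0 + math.exp(-z))


def average_marginal_effect(coefs, rows, var):
    """AME of a continuous regressor `var` in a logit model.

    For each observation the marginal effect is beta_var * p * (1 - p),
    where p is the predicted rejection probability. The AME is the
    sample mean of these observation-level effects.
    """
    effects = []
    for row in rows:
        z = coefs["const"] + sum(coefs[k] * row[k] for k in row)
        p = logistic(z)
        effects.append(coefs[var] * p * (1.0 - p))
    return sum(effects) / len(effects)


# Hypothetical coefficients and synthetic applicants (not the paper's data).
random.seed(0)
coefs = {"const": -0.5, "REL_INC": -0.2, "CSCORE": 0.3}
rows = [
    {"REL_INC": random.uniform(0.0, 3.0), "CSCORE": random.gauss(0.0, 1.0)}
    for _ in range(1000)
]

ame_rel_inc = average_marginal_effect(coefs, rows, "REL_INC")
print(f"AME of REL_INC: {ame_rel_inc:.3f}")
```

The same calculation applies to any continuous regressor. For a dummy such as FEM, the usual analogue is the average difference in predicted probabilities with the dummy switched on and off.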

5. Concluding remarks

Using cross-sectional data, this study investigates the presence of taste-based and statistical discrimination in consumer lending in India. The results provide robust evidence of statistical discrimination, in which higher relative income significantly reduces the likelihood of loan rejection. In contrast, the findings on taste-based discrimination diverge from the conventional literature, suggesting a pattern of favourable treatment: female applicants with higher credit scores are less likely to be rejected for loans. This duality of outcomes underscores the nuanced nature of discrimination in consumer lending, highlighting its variation across demographic cohorts and loan categories.

From a policy angle, these findings carry both country-specific and broader implications. As regards the former, the analysis needs to be viewed in light of the RBI’s Fair Practices Code (FPC), as updated by subsequent guidelines, including the 2025 Digital Lending Directions. The evidence of a bifurcated system, in which only women with higher credit scores are less likely to be rejected, points to inconsistent enforcement: lower-scored women face systemic disadvantages, which compromises the FPC's transparency and non-discrimination clauses. The 2025 Digital Lending Directions reinforce this linkage by requiring Key Facts Statements (KFS), cooling-off periods, and impartiality in algorithmic loan matching, yet they appear less responsive to gender-specific biases in scoring algorithms. Given the evidence that women are safer borrowers (Hermes & Lensink, 2011; Srivastava, 2013; Cowling et al., 2020), there is a need to move beyond traditional, one-dimensional credit scoring models. Integrating alternative data sources into credit assessments would provide a more holistic picture of creditworthiness and reduce reliance on traditional scores that disproportionately penalise women in high-gender-bias countries.

Second, and more generally, targeted amendments informed by international best practices, such as the U.S. Consumer Financial Protection Bureau's (CFPB) Equal Credit Opportunity Act guidelines, can be explored. Such amendments could, among other things, mandate annual gender-disaggregated reporting of credit scores, approval rates, loan amounts, default outcomes, and interest differentials, thereby expanding disclosure requirements to include bias metrics for accountability (e.g., Liu and Liang, 2025). This would facilitate a fuller assessment of the observed discrimination against lower-scored women in scoring models. Two further measures can also be considered: enforcing periodic audits of algorithmic lending systems, and incorporating alternative ‘soft’ data such as utility payments, rental history, and digital footprints into creditworthiness assessments, as piloted in European microfinance and recommended by Berg et al. (2020).

A couple of limitations of the analysis are worth noting. First, the analysis is based on cross-sectional data; as a result, it can establish correlation at best, not causation. Relatedly, credit scores are endogenous and can be influenced by historical factors, such as past financial discrimination or limited access to formal credit. This endogeneity introduces bias, as the credit scores used to predict current loan outcomes may already reflect the very systemic biases the analysis seeks to understand. Consequently, the observed association between a low credit score and loan rejection might reflect embedded historical inequities rather than current creditworthiness. Addressing these challenges through better, more comprehensive data can help navigate the underlying issues and facilitate informed policymaking.

Funding: This research received no external funding.

Data Availability Statement: The data is proprietary and is sourced from a leading credit bureau in India.

Conflicts of Interest: The author declares no conflicts of interest.

AI Use Statement: The authors confirm that no AI tools were used in the writing, editing, data analysis, or figure generation of this manuscript.

Disclaimer: All statements, viewpoints, and data featured in the publications are exclusively those of the individual author(s) and contributor(s), not of MFI and/or its editor(s). MFI and/or the editor(s) absolve themselves of any liability for harm to individuals or property that might arise from any concepts, methods, instructions, or products mentioned in the content.