[Intel-wired-lan] [PATCH v4] igb: Assign random MAC address instead of fail in case of invalid one
Alexander H Duyck
alexander.duyck at gmail.com
Mon Jun 6 17:04:13 UTC 2022
On Mon, 2022-06-06 at 22:35 +0800, 梁礼学 wrote:
> Hi,
> thank you very much for your suggestion.
>
> As you said, the way to cause ‘Invalid MAC address’ is not only through igb_set_eeprom,
> but also some pre-production or uninitialized boards.
>
> But if set by module parameters, especially in the case of CONFIG_IGB=y,
> the situation may be more troublesome, because for most users, if the system does not properly load and generate
> the network card device, they can only ask the host supplier for help. But,
A module parameter can be passed as a part of the kernel command line
in the case of CONFIG_IGB=y. So it is still something that can be dealt
with via module parameters.
> (1) If the invalid mac address is caused by igb_set_eeprom, it is relatively more convenient for most operations engineers
> to use ethtool to change the invalid mac address to one they think should be valid, which may still be invalid.
> In that case, having the driver assign a valid random MAC address allows the network card driver to finish loading.
> As for what you mentioned, in this case if the user does not notice that the driver has used a random mac address,
> it may lead to other problems. But the fact is that if the user deliberately sets a customized mac address,
> the user should pay attention to whether the mac address is successfully changed, and also pay attention to the
> expected result after changing the mac address. When users find that the mac address cannot
> be successfully changed to the customized one, they can continue debugging, which is easier than looking for
> the host supplier’s support from the very first time of “Invalid MAC address”.
The problem is, having a random MAC address automatically assigned
makes it less likely to detect issues caused by (1). What I have seen
in the past is people programming EEPROMs and overwriting things like
the MAC address with all 0s. This causes an obvious problem with the
current driver. If it is changed to just default to using a random MAC
address when this occurs, the issue can be easily overlooked and will
likely make the device harder to maintain, as it becomes harder to
identify whether there may be EEPROM issues.
> (2) If the invalid mac address is caused during pre-production or initialization of the board, it is even more necessary
> to use a random mac address to complete the loading of the network card, because the user only cares about whether
> the network card is loaded, not what the valid MAC address is.
This isn't necessarily true. What I was getting at is that in the pre-
production case there may not be an EEPROM even loaded and as one of
the initial steps it will be necessary to put one together for the
device.
The user could either make the module parameter permanent and have it
used for every boot, or they might just have to set it once in order to
load a valid EEPROM image on the system.
> And I also noticed that ixgbevf_sw_init also uses a random valid mac address to continue loading the driver when
> the address is invalid. In addition, network card drivers such as marvell, broadcom, realtek, etc. also do not
> directly exit the driver loading when an invalid MAC address is detected, but use a random valid MAC address.
The VF drivers assign a random MAC address for historic reasons more
than anything else. In addition, the use of a random MAC address is
generally more-or-less frowned upon. There is logic in ixgbevf that
will cause the PF to reject the VF MAC address and overwrite the MAC
address from the PF side.
As for the other drivers, they have their reasons. In many cases I
suspect the driver is intended for an embedded environment where the
user might not be able to reach the device if the NIC doesn't come up.
The igb driver is meant to typically be used in a desktop environment.
Catching a malformed MAC address is important as a part of that as it
is one of the health checks for the device. That is why I am open to
supporting it by default, but only if it is via a module parameter to
specify the behavior. Otherwise we are changing a key piece of driver
behavior and will be potentially masking EEPROM issues.
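
A minimal userspace sketch of the behavior argued for above, not the
actual igb code: an invalid address fails the probe unless the user has
explicitly opted in to a random one. The name allow_random_mac and the
helper functions are hypothetical, and the -1 return is only a stand-in
for whatever error the driver would report. With CONFIG_IGB=y such a
parameter would be given on the kernel command line in the usual
igb.<parameter>=<value> form.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6

/* A usable unicast address is neither all zeros nor multicast
 * (the multicast test also rejects the broadcast address). */
static bool mac_is_valid(const uint8_t mac[ETH_ALEN])
{
	static const uint8_t zero[ETH_ALEN] = {0};

	if (memcmp(mac, zero, ETH_ALEN) == 0)
		return false;
	return (mac[0] & 0x01) == 0;
}

/* Fill mac with a random, locally administered unicast address. */
static void mac_randomize(uint8_t mac[ETH_ALEN])
{
	for (int i = 0; i < ETH_ALEN; i++)
		mac[i] = (uint8_t)(rand() & 0xff);
	mac[0] &= 0xfe;	/* clear the multicast bit */
	mac[0] |= 0x02;	/* set the locally administered bit */
}

/* Hypothetical probe-time check: returns 0 on success, -1 on failure.
 * allow_random_mac models the proposed opt-in module parameter. */
static int probe_check_mac(uint8_t mac[ETH_ALEN], bool allow_random_mac)
{
	if (mac_is_valid(mac))
		return 0;	/* EEPROM address is fine, keep it */
	if (!allow_random_mac)
		return -1;	/* default: fail loudly so EEPROM issues are seen */
	mac_randomize(mac);
	return 0;
}
```

With the flag left off, an all-zero address still fails the probe, so a
damaged EEPROM stays visible; with it on, the address is replaced and
loading continues.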